    • CommentAuthorkevin
    • CommentTimeDec 13th 2006
     
    Hello, I have 2 files, each with 40,000 lines of addresses. I want to compare these files, find the addresses that appear in both (duplicates), and place those addresses in a third file.

    How can I do this easily? (I don't have much experience with programming.)
    • CommentAuthorgilray
    • CommentTimeDec 13th 2006
     
    One way would be to store the first file in a database, for example, and then compare the second file against it line by line. Or you could just read one of the files into an array and compare the other file against it line by line. But with 40,000 lines I'd recommend the DB option.
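    The array option can be sketched in PHP roughly as follows. This is a minimal illustration rather than a tested solution for the real data: the function name findCommonLines is made up for the example, and it assumes one address per line and that an exact, case-sensitive match is what is wanted. Using the lines of the first file as array keys turns each lookup into a hash check instead of a scan of the whole array.

```php
<?php
// Hypothetical helper (not from the thread): returns the lines found in both files.
function findCommonLines(string $fileA, string $fileB): array
{
    // file() reads a whole file into an array, one element per line;
    // the flags drop the line endings and skip empty lines.
    $flags = FILE_IGNORE_NEW_LINES | FILE_SKIP_EMPTY_LINES;
    $linesA = file($fileA, $flags);
    $linesB = file($fileB, $flags);
    if ($linesA === false || $linesB === false) {
        throw new RuntimeException('Could not read one of the input files');
    }

    // Turn the first file into a lookup table keyed by address.
    $seen = array_flip($linesA);

    $common = [];
    foreach ($linesB as $line) {
        if (isset($seen[$line])) {
            $common[$line] = true;   // keyed by line, so each match is kept once
        }
    }

    // PHP turns purely numeric keys into integers; convert them back to strings.
    return array_map('strval', array_keys($common));
}
```

    The result could then be written out with something like file_put_contents('thirdfile.txt', implode("\n", $common) . "\n"), where $common is the returned array. Both files are held in memory at once, which should be fine for 40,000 short lines but is worth keeping in mind for much larger inputs.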
    • CommentAuthorjustnajm
    • CommentTimeDec 15th 2006
     
    Here is the code I recently developed for the same purpose.
    Just replace the filenames with your own
    and save the file as filehandle.php.

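    <?php
    // Copy every line of inputfile1.txt that also appears in inputfile2.txt
    // into thirdfile.txt.
    $filename = 'thirdfile.txt';
    $handle1 = fopen("inputfile1.txt", "r");    // first input file, read-only
    $handle2 = fopen("inputfile2.txt", "r");    // second input file, opened once and rewound per pass
    $handle3 = fopen($filename, 'a');           // append mode; creates the file if it does not exist
    if (!$handle1 || !$handle2 || !$handle3) {
        echo "Cannot open one of the files";
        exit;
    }

    // Reading inside the loop condition avoids the extra false that fgets()
    // returns when feof() is only checked before the read.
    while (($line1 = fgets($handle1)) !== false) {
        $line1 = rtrim($line1, "\r\n");         // ignore line-ending differences
        if ($line1 === '') {
            continue;                           // skip blank lines
        }
        rewind($handle2);                       // rescan the second file from the top
        while (($line2 = fgets($handle2)) !== false) {
            // === compares the strings exactly; == would treat numeric-looking lines as numbers
            if ($line1 === rtrim($line2, "\r\n")) {
                if (fwrite($handle3, $line1 . "\n") === false) {
                    echo "Cannot write to file ($filename)";
                    exit;
                }
                echo "Success, wrote ($line1) to file ($filename)<br>";
                break;                          // one match is enough; stop scanning
            }
        }
    }

    fclose($handle1);
    fclose($handle2);
    fclose($handle3);
    ?>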


    Files used:

    inputfile1.txt

    fakeaddress@hotmail.com
    thing21@hotmail.com
    playmode@hotmail.com
    fairyperson@yahoo.com
    MKDeception@yahoo.com

    inputfile2.txt

    fakeaddress@hotmail.com
    thing21@hotmail.com
    400@hotmail.com
    playmode@hotmail.com
    least2006@yahoo.com
    fairyperson@yahoo.com
    MKDeception@yahoo.com

    After the comparison, the matching lines are written to this file:

    thirdfile.txt

    fakeaddress@hotmail.com
    thing21@hotmail.com
    playmode@hotmail.com
    fairyperson@yahoo.com
    MKDeception@yahoo.com

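    Other plain-text files work as well; a Word .doc file is a binary format, so it would have to be saved as plain text first. Also make sure each address is on its own line. These sample files are small, so they are processed quickly. Files as long as yours will take a while, because the script rescans the second file for every line of the first, so you may need to raise the script's execution time limit on the server (PHP's max_execution_time setting).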
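    Raising the time limit from inside the script might look like the sketch below. The value 300 is an arbitrary example, not a recommendation, and some hosts do not let scripts change the limit at all, which is why the return value is checked.

```php
<?php
// Ask for up to 300 seconds (an arbitrary example value) before the script
// is stopped. set_time_limit() returns false if the limit cannot be changed.
$changed = set_time_limit(300);
if (!$changed) {
    echo "Could not raise the time limit; the server default still applies.<br>";
}
```

    Placed at the top of filehandle.php, this would apply to the whole comparison run.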