Hi people. I am having some trouble with the preg functions in PHP. Here's what I am trying to do...
First of all, I am reading in a file which is 1.5 MB in size (it could be larger, up to 8 MB), and the contents of the file are read into a string. The format of each record is as follows:

    # # # "quoted text" "quoted text" # #

Each # represents a number. The first three numbers are only ever 1 or 2 digits long; the final two numbers can get rather large, into the thousands and millions. The elements are separated by tabs, and a carriage return (\r) terminates each record.

I use preg_match_all to find all the lines that have 1 and 1 as their first two numbers; typically there will be 25 such entries. So I am looking for all lines in this format:

    1 1 # "quoted text" "quoted text" # #

I have the search pattern figured out. It is as follows:

    preg_match_all("/($first)\t($second)\t([0-9]{1,2})\t\"([^\"]*)\"\t\"([^\"]*)\"\t([0-9]*)\t([0-9]*)\r/", $input, $output, PREG_SET_ORDER );

When this pattern finds a line whose first two fields equal $first and $second, it puts all the elements of the record into the array $output: $output[0] is the array of elements from the first matching line, $output[1] is the array for the second matching line, and so on.

This pattern does work to some extent. When the file size is low (100 KB) it works fine, but once I get over that size it becomes greedy and the $second value doesn't seem to be taken into account when it searches. It seems to return everything that matches the following:

    1 # # "quoted text" "quoted text" # #

Obviously that is not what I want. Could this be some sort of overflow problem? I am at a loss here, so if anyone could offer some insight into why it is not functioning correctly, I would most welcome it. Otherwise the only solution I can think of is chopping up the input, and I don't really want to go down that path, as it seems like a rather cheap workaround.

Thanks.
Matt
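A minimal sketch for narrowing this down, under stated assumptions. The sample records are invented for illustration. The `(?<![^\r])` lookbehind and the `preg_quote()` calls are additions that are not in the pattern above: the lookbehind only lets a match begin at the start of the string or directly after a `\r`, so a record such as `11 1 3 ...` cannot match from its second digit. The `preg_last_error()` check is one way to see whether a PCRE limit was hit on large input. None of this is confirmed as the cause of the behaviour described.

```php
<?php
// Hypothetical sample data: tab-separated fields, each record terminated by "\r".
$input = "1\t1\t5\t\"foo\"\t\"bar\"\t1200\t3400000\r"
       . "1\t2\t7\t\"baz\"\t\"qux\"\t10\t20\r"
       . "11\t1\t3\t\"abc\"\t\"def\"\t5\t6\r"
       . "1\t1\t12\t\"ghi\"\t\"jkl\"\t7\t8\r";

$first  = '1';
$second = '1';

// Assumption: anchor each match to the start of a record.
// (?<![^\r]) succeeds only at the start of the string or right after "\r".
// preg_quote() escapes any regex metacharacters in the two values.
$pattern = '/(?<![^\r])'
         . '(' . preg_quote($first, '/') . ')\t'
         . '(' . preg_quote($second, '/') . ')\t'
         . '([0-9]{1,2})\t"([^"]*)"\t"([^"]*)"\t([0-9]*)\t([0-9]*)\r/';

$count = preg_match_all($pattern, $input, $output, PREG_SET_ORDER);

if ($count === false) {
    // preg_match_all() returns false on failure; preg_last_error() gives the code,
    // e.g. PREG_BACKTRACK_LIMIT_ERROR when pcre.backtrack_limit was exceeded.
    echo "PCRE error code: " . preg_last_error() . "\n";
} else {
    echo "Matched records: $count\n";
    foreach ($output as $record) {
        // $record[1]..$record[7] hold the seven captured fields of one record.
        echo implode(' | ', array_slice($record, 1)) . "\n";
    }
}
```

With the sample data, only the two records whose first two fields are exactly 1 and 1 are returned; the `11 1 3` record is skipped because of the anchor. If the error branch fires on the real file, that would point towards a PCRE limit rather than the pattern itself.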