Hi Matt,

First, it might be easier to just use split here:

$fields = split("\t", $line);

I say "might" because I suspect that the records that are converting
incorrectly don't follow the tab convention. Said another way, I've never
seen preg break because the file got over a meg in size ;)

On Fri, Dec 07, 2001 at 11:50:51PM -0000, Matthew Moreton wrote:
> Hi people. I am having some trouble with the PREG functions in PHP.
>
> Here's what I am trying to do...
>
> First of all, I am reading in a file which is 1.5 MB in size (it could
> be much larger, going up to 8 MB); the contents of the file are read
> into a string.
>
> The format of the file is as follows...
>
> # # # "quoted text" "quoted text" # #
>
> The # represents a number. The first three numbers are only ever 1 or 2
> digits long. The final two numbers can get rather big, into the
> thousands and millions. Each element is separated by a tab, and a
> carriage return (\r) terminates each record.
>
> I use preg_match_all to find all the lines that start with 1 and 1 as
> their first numbers; typically there will be 25 entries of 1 1. So I am
> looking for all lines in this format:
>
> 1 1 # "quoted text" "quoted text" # #
>
> I have the search pattern figured out. It is as follows:
>
> preg_match_all("/($first)\t($second)\t([0-9]{1,2})\t\"([^\"]*)\"\t\"([^\"]*)\"\t([0-9]*)\t([0-9]*)\r/",
>     $input, $output, PREG_SET_ORDER );
>
> When this pattern finds a line beginning with $first and $second, it
> puts all the elements of the record into the array $output: $output[0]
> is the array of elements from the first line matched, $output[1] the
> second line matched, and so on.
>
> This pattern does actually work to some extent. When the file size is
> low (100 KB) it works fine, but once I get over that size it becomes
> greedy and the $second value doesn't seem to be taken into account
> when it searches.
> It seems to return everything that matches the following:
>
> 1 # # "quoted text" "quoted text" # #
>
> Obviously not what I want. Could this be some sort of overflow problem?
> I am at a loose end here, so if anyone could offer some insight as to
> why it is not functioning correctly I would most welcome it. Otherwise
> the only solution I can think of is chopping up the input. I don't
> really want to go down that path, as it seems like a rather cheap
> workaround.
>
> Thanks.
>
> Matt

-- 
Hank Marquardt <[EMAIL PROTECTED]>
http://web.yerpso.net
GPG Id: 2BB5E60C
Fingerprint: D807 61BC FD18 370A AC1D 3EDF 2BF9 8A2D 2BB5 E60C
*** Web Development: PHP, MySQL/PgSQL - Network Admin: Debian/FreeBSD
*** PHP Instructor - Intnl. Webmasters Assn./HTML Writers Guild
*** Beginning PHP -- Starts January 7, 2002
*** See http://www.hwg.org/services/classes
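
A minimal sketch of the split-based approach suggested at the top of this
reply, for anyone who wants to try it. It uses explode() rather than
split(), since split() was removed in PHP 7 and explode() does the same job
for a literal delimiter. The function name find_records and the sample data
are made up for illustration and are not from Matt's script; the type hints
assume PHP 7 or later.

```php
<?php
// Hypothetical helper (not from the original thread): return the fields of
// every record whose first two columns equal $first and $second exactly.
function find_records(string $input, string $first, string $second): array
{
    $matches = [];
    // Records end with a carriage return (\r), per the format described above.
    foreach (explode("\r", $input) as $line) {
        // Drop a leading newline in case the file really uses \r\n endings.
        $line = ltrim($line, "\n");
        if ($line === '') {
            continue;
        }
        $fields = explode("\t", $line);
        // A well-formed record has exactly seven tab-separated fields;
        // anything else is skipped here.
        if (count($fields) !== 7) {
            continue;
        }
        // Compare whole fields as strings, so "11" never matches "1".
        if ($fields[0] === $first && $fields[1] === $second) {
            // Strip the surrounding double quotes from the two text fields.
            $fields[3] = trim($fields[3], '"');
            $fields[4] = trim($fields[4], '"');
            $matches[] = $fields;
        }
    }
    return $matches;
}

// Made-up sample data, for illustration only.
$input = "1\t1\t5\t\"foo\"\t\"bar\"\t1000\t2000000\r"
       . "1\t2\t7\t\"baz\"\t\"qux\"\t3\t4\r"
       . "11\t1\t9\t\"no\"\t\"match\"\t5\t6\r";

$output = find_records($input, "1", "1");
print_r($output);
```

Because each field is compared as a whole string, a record starting with 11
cannot slip through as a match for 1. If the real file turns out to contain
lines with a different field count, the count() check is the place to log
them and see which records break the tab convention.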