RE: finding rows in a large file (22 millions of rows)
Madhu Reddy wrote:
> We are trying to load data into Teradata (which is data warehousing,
> stores terabytes of data, and which is 10 times faster than any other
> database).

Data warehousing is always an exciting subject! However, I'd be surprised to see this kind of performance increase. A major factor in database performance is the database design. Many database designers do not know how to build data warehouses; they are stuck on normal relational concepts. Anyway, sorry to be off topic... I just can't turn down a database debate! :)

> Before loading data into Teradata, we need to do some massaging on
> the data... basically eliminating duplicate rows and invalid rows.

I don't know anything about the Teradata database system, but I know how I would do this on other systems:

1. Load the data as it is into a temporary database.
2. Do a select (or a report), returning unique (distinct) rows. This same select could also filter out your invalid rows and massage the data.
3. Load the result of the select into the final database.

If you are really looking to do this with Perl, I guess you could load the data into a hash and then print the unique values. I have no idea how long this would take to run on 22 million rows, but the code would be fairly straightforward: use each whole row (all of its columns) as a hash key. Hash keys are unique, so a duplicate row lands on a key that already exists, and you only write a row to the output file the first time you see it. You don't need to sort to find the duplicates; sort the keys only if you want sorted output (this may take a little while). At this point you could also skip invalid rows, do some formatting, etc. Keep in mind that the whole hash has to fit in memory.

I guess it all just depends on which you are more comfortable with.

Hope this helps,
Jared

--
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]
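A minimal sketch of the hash approach from the message above, assuming each row is one line of text. The sample rows, the `unique_rows` name and the validity check are made up for illustration; the real file layout was not posted.

```perl
use strict;
use warnings;

# Return the rows of @$rows with duplicates removed, keeping first-seen order.
# $is_valid is a hypothetical callback that decides whether a row is kept.
sub unique_rows {
    my ($rows, $is_valid) = @_;
    my %seen;
    my @out;
    for my $row (@$rows) {
        next unless $is_valid->($row);   # drop invalid rows
        next if $seen{$row}++;           # drop rows we have already seen
        push @out, $row;
    }
    return @out;
}

# Made-up sample data: two duplicates and one row with an empty first column.
my @rows  = ("1|alice", "2|bob", "1|alice", "|broken", "2|bob");
my @clean = unique_rows(\@rows, sub { $_[0] =~ /^\d+\|/ });
print "$_\n" for @clean;
```

For a file of this size you would probably read it line by line instead of holding all rows in an array, but the `%seen` hash still needs one key per distinct row.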
RE: Formatting Variables.
Ramón Chávez wrote:
> I mean, if I don't want to get printed 3.1415926535 (or any
> irrational number) but something like 3.14, is there a way to use
> format?

I agree with the other posts. Use printf. Here is some more reading to check out:

    perldoc -q long decimals
    perldoc -q round

Hope this helps,
Jared
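A short sketch of the printf suggestion, using the number from the question. `sprintf` takes the same format string and returns the result instead of printing it.

```perl
use strict;
use warnings;

my $pi = 3.1415926535;

# %.2f means: floating point, two digits after the decimal point (rounded).
printf "%.2f\n", $pi;

# sprintf builds the string without printing, so it can be stored or reused.
my $short = sprintf "%.2f", $pi;
print "pi is roughly $short\n";
```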
RE: still needing help
Warning: opinionated text follows, so please don't take offense :)

stepping onto soapbox

> > whatever it gets text where it needs to go... and if all you need
> > is text the form parser below is fine.. also... if you're not offering
> > any real help... maybe you can keep your comments to yourself :)
> > I hate people who answer questions with "no" or "you can't do that"
> > or the like.. it's freakin lame... maybe you could tell him why you
> > think this form parser is broken and actually really help someone...
> > ok so maybe stop being so freakin cool for just a sec and try to help
>
> sure =0).. I'm no Randal Schwartz:

I kept telling myself that I wasn't going to get involved... Oh well. Although I think Jdavis was a little too harsh, I have to agree with him (although not as adamantly).

First off, I also admit I'm no Perl guru, but I'm learning. I think Jdavis was saying that _reasons_ why something is broken (or doesn't work, or is bad, or whatever) are helpful to beginners. In fact, I occasionally find myself frustrated with the brevity of many responses to people's questions. I think a lot of people are using this list to learn, not just to be told what to do. I'm not saying to write a novel out of each response, but a little detail can be nice. You have to remember, a lot of people who are learning Perl (and even many who are learning English) are using this list.

> I don't think the parser is broken, I KNOW it is ;0). Among other
> things, this:
>
>     @in = split(//,$in);
>
> is 'bad, bad, bad, bad, ' x 100_000_000

Why is this bad? Don't get me wrong... I'm not saying you are incorrect, because frankly I don't know. Is it because he is using a scalar with the same name as the array he is assigning it to? Oh well, I don't even remember the rest of the code that was posted :)

> I can't even use programs that use that parser on my RH Konqueror or
> Mozilla

Why does it cause problems with Konqueror or Mozilla?

I guess I've always been the type to question someone else's opinions. :) There is really no offense intended. I'm just hoping to keep people's minds open and ease tensions. Let's try to keep this list as helpful as possible.

stepping down from soapbox

Jared
RE: how do i get rid of " and , chars ??????
Swami wrote:
> I am reading a line from a file and splitting it into a 2 dimensional
> array, this is no probs.. BUT I want to get rid of " and , out of each
> line - how do I do this???

You can use the transliteration operator for this. You will have to use the d modifier to tell it to delete the characters you specify. Just put this into your code:

    while ($line = <INFILE>)   # This is where you're reading in the file
    {
        chomp $line;           # remove the trailing newline

        # this is the transliteration. Look for any " or , characters
        # and delete them.
        $line =~ tr/",//d;

        # now, you can do your splitting, etc...
    }

If you prefer, you could of course use the tr on the array elements, after it was split.

Regards,
Jared
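A self-contained sketch of the second option (tr on the elements after the split), assuming the unwanted characters are double quotes and commas. The sample line and the `strip_quotes_and_commas` helper are invented for illustration.

```perl
use strict;
use warnings;

# Return a copy of $text with every double quote and comma deleted.
sub strip_quotes_and_commas {
    my ($text) = @_;
    (my $clean = $text) =~ tr/",//d;   # d modifier: delete the listed characters
    return $clean;
}

# Made-up sample line. Split on commas first so the field boundaries survive,
# then remove the leftover quote characters from each field in place.
my $line   = '"apple","banana","cherry"';
my @fields = split /,/, $line;
tr/",//d for @fields;

print join("|", @fields), "\n";
print strip_quotes_and_commas($line), "\n";
```

Running tr on the whole line before splitting also removes the commas that separate the fields, which is why this sketch splits first.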
RE: Getting Perl
Scott Barnett wrote:
> Is there a free version of Perl that I can get that will run on a
> Win98 machine? I want to start learning Perl. I checked ActiveState but
> it looks like that is only a 15 or 30 day evaluation. I may be wrong?

I assume you want a binary, not the source code? Well, check out this link. It is the CPAN Perl ports page (Windows binaries section):

http://www.cpan.org/ports/#win32

I'm personally using SiePerl. It seems to be able to do everything I need. :) If you go to the link, there is installation documentation and a list of modules. You probably want the 5.8 version.

Hope this helps,
Jared
RE: Size of number in scalar
Chris said:
> Someone posted a question as to the size of number which a scalar
> would tolerate.

I guess I missed this thread, so I hope I'm not repeating information. :)

> Perl seems to tolerate quite a bit of this, as the app has been
> churning away, printing every so many numbers just to let me know where
> it's at, and it recently went past two hundred billion (I cheated and
> am incrementing by 1000 instead of 1, because I got impatient
> incrementing by one).

From the O'Reilly Camel Book:

"Perl stores numbers as signed integers if possible, or as double-precision floating point values in the machine's native format otherwise."

> $scalars in perl handle big numbers ... and maybe Perl notices when a
> boundary is being crossed and reconstitutes the number?

Also, see perldoc perlnumber for more information. It explains a lot of this.

Sorry if I'm repeating information,
Jared
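A small sketch of the experiment described above, under the assumption that you only want to see which native sizes your perl was built with and confirm that counting past two hundred billion in steps of 1000 stays exact. The `count_by_thousands` helper is made up for this example; `%Config` comes from the core Config module.

```perl
use strict;
use warnings;
use Config;

# Size in bytes of the native integer (IV) and floating point (NV) types
# this perl was built with. The values differ between builds.
print "ivsize: $Config{ivsize}, nvsize: $Config{nvsize}\n";

# Add 1000 to $start, $steps times, and return the result.
sub count_by_thousands {
    my ($start, $steps) = @_;
    my $n = $start;
    $n += 1000 for 1 .. $steps;
    return $n;
}

my $n = count_by_thousands(200_000_000_000, 5);
print "$n\n";
```

Whether the value is held as an integer or as a double at that point depends on the build, but either one represents numbers of this size exactly.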
RE: printing number with commas in it
Reggie wrote:
> I am trying to print a number with commas in it. I cannot find the
> correct syntax to do this with printf. I considered using the substr
> function but this depends on me always knowing the size of the number.
> Can you help me with this?

I like to use the method listed in the perldocs. Try this:

    perldoc -q output my numbers with commas

It lists a really cool solution, although I'm sure there are plenty of others :) It credits Benjamin Goldberg with this:

    s/(^[-+]?\d+?(?=(?>(?:\d{3})+)(?!\d))|\G\d{3}(?=\d))/$1,/g;

Hope this will work for you,
Jared
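A sketch that wraps the FAQ substitution in a function so it can be applied to any number. The `commify` name is chosen here for illustration.

```perl
use strict;
use warnings;

# Insert a comma between each group of three digits in the integer part.
sub commify {
    my ($number) = @_;
    # First alternative: the leading digits that precede a whole number of
    # three-digit groups. Second alternative: each following group of three
    # that still has a digit after it.
    $number =~ s/(^[-+]?\d+?(?=(?>(?:\d{3})+)(?!\d))|\G\d{3}(?=\d))/$1,/g;
    return $number;
}

print commify(1234567), "\n";
print commify(-1234), "\n";
print commify(123), "\n";
```

Because the function works on a copy, the original value stays a plain number and can still be used in arithmetic.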
RE: uploading and downloading files to MySQL (--2--)
Mariusz wrote:
> What type of field should I use for storing the path; just VARCHAR I
> guess? And as far as the filenames - make up some random file name
> for each submitted file?

This is a little off topic for a Perl list, but I'll give it a shot. Be forewarned that I do not know much about MySQL. I am writing this based on my experience with Oracle and other databases.

It all depends on how you want to handle it. VARCHAR would work fine for storing a path. If you wanted to, you could also come up with a random name for the file. Of course, you would have to do some checking to ensure the file name was not in use, or maybe just use a fancy directory naming scheme.

> ps. If storing files in the DB is not common, what exactly is the
> BLOB type for?

Storing files in the DB is not common, but it is used in some cases. Depending on your application, it may be appropriate. There are several advantages to storing these types of files in the database:

1. You don't have to worry about file names.
2. You get an indexed search to find the file itself. In other words, you don't need another system call to get the file off of the file system and then return it.
3. Maybe faster retrieval times (depending on how your hardware and DB are set up). Not real likely unless you have a lot of money to work with. I'm talking Oracle with multiple servers, etc.
4. Restoring the database will restore the files within it.
5. All of the database features themselves would apply to the files. For example, greater control over security.

Of course, there are also disadvantages:

1. Much larger database (this is a big disadvantage).
2. Probably slower data retrieval times (once again, depending on how things are set up). This would probably be the case for you.
3. BLOBs, CLOBs, etc. are typically much harder to work with.

I'm sure there are many other advantages/disadvantages that I could not think of offhand. Just be aware that in most situations, as a general guideline, you want to store the path to a file, not the file itself. I highly suggest trying it both ways (although I know that sometimes that is impossible). But never take anyone's word for it! :)

Also, try googling for "database storing files" or "database file storage" or something like that. I think you will find a lot of useful information.

Hope this helps,
Jared
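One possible way to handle the "random name that is not already in use" part is the core File::Temp module, which picks the random characters and creates the file in one step. This is only a sketch: the `new_upload_path` helper, the `upload_` prefix and the `.dat` suffix are invented here, and the temporary directory stands in for whatever upload directory you really use.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile tempdir);

# Create a new, uniquely named file in $dir and return its path.
# tempfile() replaces the X characters with random ones and will not
# reuse a name that already exists in the directory.
sub new_upload_path {
    my ($dir, $suffix) = @_;
    my ($fh, $path) = tempfile(
        "upload_XXXXXXXX",
        DIR    => $dir,
        SUFFIX => $suffix,
        UNLINK => 0,        # keep the file after the program exits
    );
    close $fh;
    return $path;
}

# Stand-in directory for this example; it is removed when the script ends.
my $dir  = tempdir(CLEANUP => 1);
my $path = new_upload_path($dir, ".dat");
print "path to store in the VARCHAR column: $path\n";
```

The uploaded data would then be written to that path, and only the path string goes into the database.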
RE: Beginners -- a suggestion
Just to help out Huan Huang below:

Huan Huang wrote:
> I have set up Activeperl in my windows NT system. I have some source
> file. But I really get no idea how to run the source file. Could
> anybody give me a go?

You basically need to associate the file type with the perl interpreter. I am not using NT, so I can't tell you the exact steps, but you basically want to open Explorer. When you double-click a .pl file, it will ask you what to run it with. Specify the perl.exe file from the path where it was installed. Also, you want to make sure you add the location of the perl.exe file and your perl/bin directory to your path.

As far as good tutorials... It depends on what you need help with. But, if you need help learning Perl, you might find these helpful:

http://www.comp.leeds.ac.uk/Perl/start.html
http://www.cclabs.missouri.edu/things/instruction/perl/perlcourse.html

I'm not sure, but I think I got them from the perlfaq (even if not, you'll want to read the FAQ):

http://www.perldoc.com/perl5.8.0/pod/perlfaq.html

Hope this helps,
Jared

-----Original Message-----
From: Zemer Rick [mailto:[EMAIL PROTECTED]]
Sent: Thursday, November 21, 2002 9:20 AM
To: [EMAIL PROTECTED]
Subject: Beginners -- a suggestion

I too am rather new to Perl, and have found the OpenPerlIDE debugger to be helpful. It is available on sourceforge.net and probably elsewhere. Hope that helps.

-rz.

She was trying to construct a life that made sense from things she found in gift shops. -kv.

-----Original Message-----
From: Jim Blanchard [mailto:[EMAIL PROTECTED]]
Sent: Thursday, November 21, 2002 11:02 AM
To: 'Sorin Marti'; [EMAIL PROTECTED]; [EMAIL PROTECTED]
Subject: RE: Email To Mysql.

I am new too! I have tried Activeperl on my XP machine at home and found that you have to run the version I have in a command window:

    c:\>perl file name

I hope this helps.

Jim Blanchard

-----Original Message-----
From: Sorin Marti [mailto:[EMAIL PROTECTED]]
Sent: Thursday, November 21, 2002 10:49 AM
To: [EMAIL PROTECTED]; [EMAIL PROTECTED]
Subject: Re: Email To Mysql.

[EMAIL PROTECTED] wrote:
> Hi, I am really new to Perl.

me too

> I have set up Activeperl in my windows NT system. I have some source
> file. But I really get no idea how to run the source file. Could
> anybody give me a go?

try this: http://www.wdvl.com/Authoring/Languages/Perl/Windows/

> I have checked the following sites but found few introduction to
> windows user.
> http://www.perldoc.com/perl5.8.0/pod/perl.html#NAME
> could anybody give me some helpful websites as well? thanks a lot!
> Huan Huang

Please choose a good subject when you mail a mailing list. Your message has nothing to do with "Email To Mysql", so choose a subject like "Perl documentation for windows user" so everyone knows what you're talking about.

Greets
Sorin

Confidentiality Notice: This e-mail message, including any attachments, is for the sole use of the intended recipient(s) and may contain confidential and privileged information. Any unauthorized review, use, disclosure or distribution is prohibited. If you are not the intended recipient, please contact the sender by reply e-mail and destroy all copies of the original message.

PARTNERS Health Plan
100 E. Wayne St., Suite 502
South Bend, IN 46601
Phone: 574-233-4899
Fax: 574-234-7484
www.partnersindiana.com
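To check that the interpreter is installed and on your path before trying a real source file, a tiny test script can help. This is just a sketch; the file name hello.pl and the `greeting` helper are made up.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Build the message in a function so it is easy to reuse or check.
# $] holds the version number of the running perl interpreter.
sub greeting {
    return "Perl $] is working\n";
}

print greeting();
```

Save it as hello.pl, open a command window in that directory and run `perl hello.pl`. If the command is not found, the directory containing perl.exe is probably missing from your path.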
RE: Working with text files.Please help!
Hey Vitali,

I am also a beginner, so I don't know if I am doing everything most efficiently... but here is what I wrote to do this. I wouldn't normally just write a program for someone, but the problem intrigued me. Take a look through the code and it will hopefully make sense. I hope it will at least get you going on the right road!

Jared

begin code

    $infile = "input_file.txt";
    if (! -e $infile)    # if the input file doesn't exist, die
    {
        die "File $infile not found!";
    }
    open(JARREAD, $infile) or die "Cannot open $infile: $!";   # a filehandle I made up
    while ($line = <JARREAD>)    # reads each line of the file until done. notice the =
    {
        chomp $line;
        if (substr($line,0,2) eq "10")
        {
            $outfile = substr($line,3,4) . ".out";   # grab the file name and append a .out
            if ($prefix > 0)    # a prefix is already set, so this is not the first
            {                   # time through: close the previous output file
                close(JARWRITE);
            }
            $prefix = substr($line,0,2);   # prefix is what we add to the beginning of each line.
            open(JARWRITE, ">", $outfile) or die "Cannot open $outfile: $!";
            print {JARWRITE} "$line\n";    # print the first line with no prefix
        }
        else
        {
            if (substr($line,0,2) > 10)    # this says it found the next section ie 20
            {
                $prefix = substr($line,0,1);   # make the prefix 1 character
                # for each section label 10,20,etc, do not do a prefix
                print {JARWRITE} "$line\n";
            }
            elsif (substr($line,0,2) eq "AB" or substr($line,0,2) eq "EP" or substr($line,0,2) eq "A0")
            {
                # if we found one of the above section titles, just spit out the line
                print {JARWRITE} "$line\n";
            }
            else
            {
                # this substr is because the first char is a space for all other lines
                $line = substr($line,1);
                print {JARWRITE} $prefix . "$line\n";
            }
        }
    }
    close(JARREAD);    # close the files
    close(JARWRITE);

end code

-----Original Message-----
From: Vitali Pokrovski [mailto:[EMAIL PROTECTED]]
Sent: Thursday, November 21, 2002 2:39 AM
To: [EMAIL PROTECTED]; [EMAIL PROTECTED]
Subject: Working with text files.Please help!
Dear friends,

could you please show me the way how to write the code for converting data from input_file to the output file format (1001.out, 1002.out, 1003.out)? (Please see the attached example.) I have been writing this code for about four weeks already, but unfortunately without success, because I'm just a beginner :-(

I have a text file (input_file.txt). The output I need is the files 1001.out, 1002.out, 1003.out. Here are the steps the program must do:

1. Open the input file.
2. Read lines from canal 10 to canal A0 (info from line 10 to line A0 must be one file).
3. Convert lines (canal numbers 10, 20, 30 ... in the first two positions) as is shown in the output files. For example: 10 20 20 20 20 30 30 30
4. Save this data to a new file (1001.out).

P.S. File names come from the "10 1001" lines. Actually the file name is the four characters starting at position 3; in this case it is 1001 or 1002 or 1003..

I'm not sure what I'm trying to do here..

    #!perl -w
    $count = 0;
    while (<STDIN>) {
        if (substr($_,0,2) eq "10") {    #If left 2
            ++$count;
            $lines = 0;
            open OUT, ">$count.out";
            print OUT $_;
        };
    }
    __END__

Any help welcome!

Regards,
Vitali
Estonia, Tallinn