Re: Memory explodes loading CSV into hash
Kee Hinckley wrote:

At 17:18 28.04.2002, Ernest Lergon wrote:
Now I'm scared about the memory consumption: The CSV file has 14.000 records with 18 fields and a size of 2 MB (approx. 150 Bytes per record).

Now a question I would like to ask: do you *need* to read the whole CSV into memory? There are ways to overcome this. For example:

When I have a CSV to play with and it's not up to being transferred to a real database, I use the DBD::CSV module, which puts a nice SQL wrapper around it.

I've installed DBD::CSV and tested it with my data:

    $dbh = DBI->connect("DBI:CSV:csv_sep_char=\t;csv_eol=\n;csv_escape_char=");
    $dbh->{'csv_tables'}->{'foo'} = { 'file' => 'foo.data' };

3 MB memory used.

    $sth = $dbh->prepare("SELECT * FROM foo");

3 MB memory used.

    $sth->execute();

16 MB memory used!

If I do it record by record, like

    $sth = $dbh->prepare("SELECT * FROM foo WHERE id=?");

then memory usage grows query by query due to caching. Moreover, it becomes VERY slow, because the whole file is read again every time; an index can't be created or used. No win :-(

Ernest
--
*  VIRTUALITAS Inc.                *
*  European Consultant Office      *  http://www.virtualitas.net
*  Internationales Handelszentrum  *  contact: Ernest Lergon
*  Friedrichstraße 95              *  mailto:[EMAIL PROTECTED]
*  10117 Berlin / Germany          *  ums: +49180528132130266
*  PGP-Key: http://www.virtualitas.net/Ernest_Lergon.asc
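For the record, if each request only needs one pass over the data, the file doesn't have to go through DBI at all: streaming it record by record keeps memory flat no matter how large it gets. A minimal sketch (the file name 'foo.data' and the tab-separated layout are taken from the test above; the id check is hypothetical):

```perl
#!/usr/bin/perl -w
use strict;

# Stream the CSV line by line instead of loading it into a structure;
# memory usage stays constant regardless of file size.
open my $fh, '<', 'foo.data' or die "open foo.data: $!";
while ( my $line = <$fh> ) {
    chomp $line;
    my @record = split /\t/, $line;
    # process @record here, e.g. skip rows that don't match:
    # next unless $record[0] eq $wanted_id;
}
close $fh;
```

The trade-off is the same one Ernest ran into: every lookup is a full scan, so this only wins when the whole file is touched once per request anyway.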
Re: Memory explodes loading CSV into hash
Perrin Harkins wrote:
$foo->{$i} = [ @record ];
You're creating 14000 arrays, and references to them (refs take up space too!). That's where the memory is going. See if you can use a more efficient data structure. For example, it takes less space to make 4 arrays with 14000 entries in each than to make 14000 arrays with 4 entries each.

So I turned it around: $col now holds 18 arrays with 14000 entries each and prints the correct results:

    #!/usr/bin/perl -w

    $col = {};

    $line  = "AAAA\tBBBB\tCCCC\tDDDD";  # 4 string fields (4 chars)
    $line .= "\t10.99" x 9;             # 9 float fields (5 chars)
    $line .= "\t" . 'A' x 17;           # 5 string fields (rest)
    $line .= "\t" . 'B' x 17;           #
    $line .= "\t" . 'C' x 17;           #
    $line .= "\t" . 'D' x 17;           #
    $line .= "\t" . 'E' x 17;           #

    @record = split /\t/, $line;

    foreach $j ( 0 .. $#record ) {
        $col->{$j} = [];
    }

    for ( $i = 0; $i < 14000; $i++ ) {
        map { $_++ } @record;
        foreach $j ( 0 .. $#record ) {
            push @{ $col->{$j} }, $record[$j];
        }
        print "$i\t$col->{0}->[$i],$col->{5}->[$i]\n" unless $i % 1000;
    }

    1;

and gives:

    SIZE   RSS  SHARE
    12364  12M   1044

Wow, 2 MB saved ;-))

I think a reference is a pointer of 8 Bytes, so: 14.000 * 8 = approx. 112 KBytes - right? This doesn't explain the difference of 7 MB calculated and 14 MB measured.

Ernest
Re: Memory explodes loading CSV into hash
Hi, thank you all for your hints, BUT (with capital letters ;-) I think it's a question of speed: if I hold my data in a hash in memory, access should be faster than using any kind of external database. What makes me wonder is the extremely blown-up size (mod_)perl uses for data structures.

Ernest
Re: Memory explodes loading CSV into hash
Have you tried DBD::AnyData? It's pure Perl, so it might not be as fast, but you never know. -- Simon Oliver
Re: mod_bandwith like mechanism implemented in modperl?
Has anyone implemented a bandwidth limiting mechanism in mod_perl? Have you looked at mod_throttle? http://www.snert.com/Software/mod_throttle. There was a thread on this last week so if you want more information you might read through that. --Ade.
Re: mod_bandwith like mechanism implemented in modperl?
On Mon, Apr 29, 2002 at 07:49:33AM -0500, Ade Olonoh wrote: Has anyone implemented a bandwidth limiting mechanism in mod_perl? Have you looked at mod_throttle? I have. It does not work under load. At least, three months ago it didn't at all. Alex.
different type of login with Apache::AuthCookie?
We currently use Apache::AuthCookie for authentication/authorization, and it works great. However, we want to make a change to how the login works. In addition to having Apache::AuthCookie intercept requests for URLs that require auth/authz, we would like to provide a sign-on area on the main page where the user can proactively sign in. Would this be as simple as setting the same cookie (using the same format, obviously) as Apache::AuthCookie is looking for when sign-on occurs on the front page? Or, better still, is there a way using A::A itself to do this? Thanks! -klm.
Re: [Q maybe OT] forward
At 07:15 29.04.2002, Martin Haase-Thomas wrote:

Hi Andrew, thanks for the idea to have a look at Apache::ASP. I took that look meanwhile, and to me it seems to be overhead. Maybe I'm naive, because it wasn't much more than a glance, but the code copes with things a server page *never* has to worry about, things like session handling and so on. Apache::ASP looks more like a Java class package (you know: one of these endless stories that Java people use to wrap their code in - but I don't like Java, as you may already assume...) than a Perl module. In my understanding a server page is nothing but a document that has to be processed by the server, and the result of this process is sent to the client. All the other aspects of a web application, like sessions or caching or the like, are not what the page itself has to care about. It either knows the respective values, because the handler passed them through to it - or it doesn't. But maybe I'm bragging now - wait a few weeks and we'll hopefully both see whether I'm right or not.

Some people do programming inside JSP pages too, right? And Sun even says it's a good way to get started with web programming. Anyway, what you're looking for then is a simple templating module; you should look at Perrin Harkins' tutorial: http://perl.apache.org/preview/modperl-docs/dst_html/docs/2.0/world/templates/choosing.html

-- Per Einar Ellefsen [EMAIL PROTECTED]
Re: Memory explodes loading CSV into hash
Ernest Lergon wrote:
So I turned it around: $col now holds 18 arrays with 14000 entries each and prints the correct results: ... and gives: SIZE RSS SHARE 12364 12M 1044 Wow, 2 MB saved ;-))

That's pretty good, but obviously not what you were after. I tried using the pre-sized array syntax ($#array = 14000), but it didn't help any. Incidentally, that map statement in your script isn't doing anything that I can see.

I think a reference is a pointer of 8 Bytes, so: 14.000 * 8 = approx. 112 KBytes - right?

Probably more. Perl data types are complex. They hold a lot of metadata (whether the ref is blessed, for example).

This doesn't explain the difference of 7 MB calculated and 14 MB measured.

The explanation is that Perl uses a lot of memory. For one thing, it allocates RAM in buckets. When you hit the limit of the allocated memory, it grabs more, and I believe it grabs an amount in proportion to what you've already used. That means that as your structures get bigger, it grabs bigger chunks. The whole 12MB may not be in use, although Perl has reserved it for possible use. (Grabbing memory byte by byte would be less wasteful, but much too slow.)

The stuff in perldebguts is the best reference on this, and you've already looked at that. I think your original calculation failed to account for the fact that the numbers given there for scalars are minimums (i.e. scalars with something in them won't be that small) and that you are accessing many of these in more than one way (i.e. as string, float, and integer), which increases their size.

You can try playing with compile options (your choice of malloc affects this a little), but at this point it's probably not worth it. There's nothing wrong with 12MB of shared memory, as long as it stays shared. If that doesn't work for you, your only choice will be to trade some speed for reduced memory usage, by using a disk-based structure. At any rate, mod_perl doesn't seem to be at fault here; it's just a general Perl issue.
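Rather than estimating sizes by hand, the actual footprint of a structure can be measured with Devel::Size from CPAN. A small sketch, with an 18x1000 shape standing in for the real 18x14000 data:

```perl
#!/usr/bin/perl -w
use strict;
use Devel::Size qw(total_size);   # CPAN module, not in the core distribution

# Compare the two layouts discussed in this thread, scaled down:
# 18 long arrays vs. many short per-record arrays.
my %col_major = map { $_ => [ (1) x 1000 ] } ( 0 .. 17 );   # 18 arrays x 1000
my %row_major = map { $_ => [ (1) x 18   ] } ( 0 .. 999 );  # 1000 arrays x 18

print "column-major: ", total_size(\%col_major), " bytes\n";
print "row-major:    ", total_size(\%row_major), " bytes\n";
```

The exact byte counts depend on the perl build, but the row-major layout should come out measurably larger, since every extra arrayref carries its own AV overhead.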
- Perrin
Re: different type of login with Apache::AuthCookie?
Have that proactive signin area forward to a page behind Apache::AuthCookie protection and then have that page forward them right back to where they were? If you don't have frames that would be pretty easy. -Fran Ken Miller wrote: We currently use Apache::AuthCookie for authentication/authorization, and it works great. However, we want to make a change to how the login works. In addition to having Apache::AuthCookie intercept requests for URL's that require auth/authz, we would like to provide a signon area on the main page where the user can proactively sign in. Would this be as simple as setting the same cookie (using the same format, obviously) as Apache::Authcookie is looking for when signon occurs on the front page? Or, better still, is there a way using A:A itself to do this? Thanks! -klm.
Help needed tracking down 'Callback called exit' problem
I have a problem I can't seem to track down; showing up in our logs is:

    Out of memory!
    Callback called exit.

Typically there are two or three of these right after one another. Depending on server load they show up every 15 min. to an hour. I followed the guidelines for allocating an emergency memory pool in the guide and used Apache::Debug, but I wasn't able to get any information out of it. The biggest problem is that I cannot seem to reproduce the error. I parsed the URLs out of the access log and ran all of the requests against our staging server; the error never appeared.

The webservers are behind a load balancer, running Apache 1.3.23/mod_proxy proxying requests to a mod_perl (Apache 1.3.23/mod_perl 1.26) server running on localhost. This is all running on FreeBSD 4.5-STABLE. There's plenty of free memory; I also turned on process accounting and never saw a mod_perl httpd grow above 31 MB. The load average on the boxes never gets above 0.05.

Is there any way to have the parent Apache process log all creations/exits of the children? This way I could set up an access log with the PID of each child and then trace back all requests served after its death. Any help greatly appreciated.
Re: Help needed tracking down 'Callback called exit' problem
Is there any way to have the parent Apache process log all creations/exits of the children? This way I could set up an access log with the PID of each child and then trace back all requests served after its death.

Recipe 17.5 in the cookbook describes how to do this. Basically you can hook into the PerlChildInitHandler and PerlChildExitHandler phases with some sort of marking mechanism, similar to the code here: http://www.modperlcookbook.org/code/ch17/Cookbook/LogChildren.pm HTH --Geoff
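Boiled down, the cookbook approach looks something like this (a sketch, not the cookbook's actual code; the module name and log format are made up):

```perl
package My::LogChildren;
# mod_perl 1.x handlers that record child creation and exit, so a dead
# child's PID can be matched against the access log afterwards.
use strict;
use Apache::Constants qw(OK);

sub child_init {
    warn sprintf "[%s] child %d started\n", scalar(localtime), $$;
    return OK;
}

sub child_exit {
    warn sprintf "[%s] child %d exiting\n", scalar(localtime), $$;
    return OK;
}

1;

__END__
# httpd.conf:
#   PerlChildInitHandler My::LogChildren::child_init
#   PerlChildExitHandler My::LogChildren::child_exit
# and add %P (the child PID) to the access log format, e.g.:
#   LogFormat "%P %h %t \"%r\" %>s" withpid
```

With the PID in both the error log and the access log, every request a child served before dying can be pulled out with a simple grep.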
Re: Memory explodes loading CSV into hash
Perrin Harkins wrote:
[snip] Incidentally, that map statement in your script isn't doing anything that I can see.

It simulates different values for each record - e.g.:

    $line = "AAAA\tBBBB\t1000\t10.99";
    @record = split /\t/, $line;

    for ( $i = 0; $i < 14000; $i++ ) {
        map { $_++ } @record;
        # $i=0: @record = ('AAAB','BBBC',1001,11.99);
        # $i=1: @record = ('AAAC','BBBD',1002,12.99);
        # $i=2: @record = ('AAAD','BBBE',1003,13.99);
        # etc.
    }

[snip]

Thanks for your explanations about perl's memory usage.

At any rate, mod_perl doesn't seem to be at fault here. It's just a general perl issue.

I think so, too.

Ernest
Re: Help needed tracking down 'Callback called exit' problem
At 18:10 29.04.2002, Paul Dlug wrote: I have a problem I can't seem to track down, showing up in our logs is: Out of memory! Callback called exit. I don't know if it'll be of any help, but you might want to look in the guide: http://perl.apache.org/preview/modperl-docs/dst_html/docs/1.0/guide/troubleshooting.html#Callback_called_exit -- Per Einar Ellefsen [EMAIL PROTECTED]
Re: Basic usage of Apache::Session::Oracle
On Mon, 29 Apr 2002, F. Xavier Noria wrote:
3. Could one set up things in a way that allows the database to see the timestamps and program a trigger to delete old sessions? Or is there a standard idiom for doing this in a different way?

That's what I usually do... just add a column to the table named 'ts' with a default value of 'sysdate', then cull the rows with a trigger (not the program!). If you're cracking open someone else's code for something, why not add a few other tidbits like CGI::remote_user() and CGI::self_url() to it (as separate nullable columns) as well, if($debug), so that you can easily track down any problems that might be occurring when you interface with your own CGIs.

--- Gabriel Millerd | Life can be so tragic -- you're Script Monkey | here today and here tomorrow.
Re: Memory explodes loading CSV into hash
Ernest Lergon wrote: Hi, thank you all for your hints, BUT (with capital letters ;-) I think, it's a question of speed: If I hold my data in a hash in memory, access should be faster than using any kind of external database. What makes me wonder is the extremely blown up size (mod)perl uses for datastructures.

Looks like you've skipped over my suggestion to use Apache::Status. It uses B::Size and B::TerseSize to show you *exactly* how much memory each variable, opcode and whatnot uses. No need to guess. You can use the B:: modules directly, but since you say that outside of mod_perl the memory usage pattern is different, I'd suggest using Apache::Status.

__ Stas Bekman JAm_pH -- Just Another mod_perl Hacker http://stason.org/ mod_perl Guide --- http://perl.apache.org mailto:[EMAIL PROTECTED] http://use.perl.org http://apacheweek.com http://modperlbook.org http://apache.org http://ticketmaster.com
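For reference, the usual configuration looks something like this (the standard Apache::Status setup for mod_perl 1.x; StatusOptionsAll turns on the optional reports, which need B::Size/B::TerseSize installed):

```
# httpd.conf
<Location /perl-status>
    SetHandler  perl-script
    PerlHandler Apache::Status
    PerlSetVar  StatusOptionsAll On
</Location>
```

Then http://yourserver/perl-status offers per-package views of symbol tables and memory usage.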
RE: File::Redundant
Interesting... not sure if implementing this in this fashion would be worth the overhead. If such a need exists, I would imagine someone would have chosen a more appropriate OS-level solution. Think OpenAFS. It is always nice to use stuff that has IBM backing and likely has at least a professor or two and some grad students helping out on it.

I had never heard of OpenAFS before your email. I will have to look into it a bit. My stuff would hopefully make it nice if you didn't want to change your OS, or if you just wanted to make File::Redundant a small part of a much larger overall system. The biggest overhead I have seen is having to do readlinks. Maybe I could get around them somehow. I will have to draw up some UML or something to show how my whole system works. Earl
RE: File::Redundant
I would think it could be useful in non-mod_perl applications as well - you give an example of a user's mailbox. With scp it might be even more fun to have around :) (/me is thinking of config files and such)

mod_perl works very well with the system for keeping track of what boxes are down, sizes of partitions and the like. However, a simple daemon would do about the same thing for, say, non-web-based mail stuff. When I release, I will likely have a daemon version as well as the mod_perl version, just using Net::Server.

What's a `very large amount of data'? We use it for tens of thousands of files, but most of those are small, and they are certainly all small compared to the 3 GB range. That is sort of the model for dirsync, I think: lots of small files in lots of different directories.

Our NIS maps are on the order of 3 GB per file (64k users).

Man, that is one big file. Guess dropping a note to this list sorta lets you know what you have to really scale to. Sounds like dirsync could use rsync if Rob makes a couple changes. Can't believe the file couldn't be broken up into smaller files. 3 GB for 64k users doesn't scale so hot for, say, a million users, but I have no idea about NIS maps, so there you go. Earl
Re: File::Redundant
This is OT for mod_perl, sorry...

* Cahill, Earl [EMAIL PROTECTED] [2002-04-29 13:55]:
Our NIS maps are on the order of 3 GB per file (64k users). Man, that is one big file. Guess dropping a note to this list sorta lets you know what you have to really scale to. Sounds like dirsync could use rsync if Rob makes a couple changes. Can't believe the file couldn't be broken up into smaller files. 3 GB for 64k users doesn't scale so hot for say a million users, but I have no idea about NIS maps, so there you go.

I haven't been following the conversation, for the most part, but this part caught my eye. It is possible to split a NIS map up into many small source files, as long as when you change one of them you recreate the map in question as a whole. I've seen places with large NIS maps (although not 3GB) split the map up into smaller files, where each letter of the alphabet has its own file in a designated subdirectory and a UID generator is used to get the next UID. When the NIS maps have to be rebuilt, the main map file is rebuilt using something like:

    (cat passwd.files/[a-z]*) > passwd; make passwd

which, of course, could be added to the Makefile as part of the passwd target. (darren)

-- OCCAM'S ERASER: The philosophical principle that even the simplest solution is bound to have something wrong with it.
Re: schedule server possible?
But I will need a thread that processes the backend stuff, such as maintaining the database and message queue (more like a cron). Is this configuration possible? You can do this now. We rely on cron to kick off the job, but all the business logic is in Apache/mod_perl. The advantage of using cron is that it has rich support for scheduling. Rob
RE: schedule server possible?
You can do this now. We rely on cron to kick off the job, but all the business logic is in Apache/mod_perl.

How do you use cron to do the scheduling, yet call Apache/mod_perl to do the processing? Considering cron does not exist in Win32, maybe an all-Apache solution would be simpler and more elegant!? --Steve

--
Notice: This e-mail message, together with any attachments, contains information of Merck & Co., Inc. (Whitehouse Station, New Jersey, USA) that may be confidential, proprietary, copyrighted and/or legally privileged, and is intended solely for the use of the individual or entity named on this message. If you are not the intended recipient, and have received this message in error, please immediately return this by e-mail and then delete it.
Re: schedule server possible?
Lihn, Steve wrote: How do you use cron to do scheduling, yet calls Apache/mod_perl to do the processing? Your cron script just uses LWP to call a module running in mod_perl. Consider cron does not exist in Win32, maybe an all-Apache solution will be simpler and more elegant!? Cron does exist on Win32. It's called Scheduled Tasks. I use it all the time to kick off perl scripts. - Perrin
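The glue script for this pattern is tiny; something like the following (a sketch: the URL and schedule are made up, and all the real work lives in the mod_perl handler behind the URL):

```perl
#!/usr/bin/perl -w
# Run from cron (or Win32 Scheduled Tasks), e.g.:
#   0 3 * * *  /usr/local/bin/nightly.pl
use strict;
use LWP::Simple qw(get);

my $url = 'http://localhost/maintenance/nightly';   # hypothetical mod_perl URL
defined( my $content = get($url) )
    or die "maintenance request to $url failed\n";
print $content;
```

It's worth restricting such a URL (e.g. allow from localhost only) so outsiders can't trigger the job.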
schedule server possible?
Hi, the Apache 2 Connection handler opens up the possibility of using it for all kinds of protocol servers. However, I have a wild question: is it possible to use Apache mod_perl for a schedule server? I.e., a server that runs on its own. For example, I can use Apache 2 for a Telnet, FTP, SMTP, or even Telephony server. But I will need a thread that processes the backend stuff, such as maintaining the database and message queue (more like a cron). Is this configuration possible?

Steve Lihn
FIS Database Support, Merck & Co., Inc. Tel: (908) 423-4441
Any known gotchas with sending cookies?
I'm really lost with this... I'm trying to set a session cookie from a PerlAccessHandler. I'm basically doing (simplified code):

    my $cookie_jar = Apache::Cookie->fetch;
    my $session_id = $cookie_jar->{ session }->value;

    if ( !session_active( $session_id ) ) {
        # create and store new session
        my $session_obj = create_session_obj();
        my $cookie = Apache::Cookie->new( $r,
            -name    => 'session',
            -value   => $session_obj->id,
            -path    => '/',
            -domain  => 'my.domain.com',
            -expires => '+30m',
        );
        $r->headers_out->add( "Set-Cookie" => $cookie->as_string );
    }
    return DECLINED;

This works fine for the first access. Subsequently, I wipe out the backend database to see if a new session is correctly created. A new session is created as expected, but the problem is that the new cookie does not seem to stick to the browser. I've verified that this doesn't seem to be a browser issue, as I have the problem with all the browsers I have (IE5.5, IE6, Mozilla 1.0rc). Are there any known gotchas for this type of thing? Or am I missing something? TIA, --d
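One classic gotcha in this area (offered as a guess, since the symptom isn't fully pinned down): headers added via headers_out are only sent with normal responses; if the request later ends in an error or a redirect they can be dropped, while err_headers_out is sent with every response:

```perl
# err_headers_out survives error and redirect responses, where plain
# headers_out can be discarded -- a common reason a Set-Cookie vanishes.
$r->err_headers_out->add( 'Set-Cookie' => $cookie->as_string );
```

If the requests that lose the cookie turn out to be error pages or redirects, this one-line change is worth a try.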
Re: Any known gotchas with sending cookies?
Hmm, I still can't get it to work, but it somehow works under LWP. The following code actually gets the cookie correctly, and no bogus sessions are created on my server. Any ideas?

    use strict;
    use LWP::UserAgent;

    my $ua = LWP::UserAgent->new();
    $ua->cookie_jar({ file => "$ENV{ HOME }/cookies.txt", autosave => 1 });

    my $req = HTTP::Request->new( GET => 'http://foobar.com' );
    my $res = $ua->request( $req );

    print $res->as_string;
    print $ua->cookie_jar->as_string, "\n";

--d

Daisuke Maki wrote: [original message quoted in full above]
Re: [Q maybe OT] forward
Hi Perrin, first of all please excuse my late answer - lots of things in mind to care about, as I'm hopefully close to releasing the 0.2 version of the server page implementation (and besides, I urgently need a new job, too). But thank you for your precise statement; that is exactly what I needed, and you helped me a lot. I think it'll be a concise offer to the programmer if I declare 'redirect', 'moved', and 'forward' to be events to which enclosing handlers have to react appropriately. Which means that a 'redirect' should lead to a 301 response, a 'moved' to a 302, and a 'forward' to whatever. But these are in fact not the server page's concern. Would you agree with this approach?

regards M

Perrin Harkins wrote:
Martin Haase-Thomas wrote: forwarding is a term that I borrowed from the JSP concept - which I'm currently trying to implement in Perl.

A JSP forward is directly equivalent to an internal redirect. It's just an include that doesn't return. In short, it's a GOTO statement. Thank you, Sun. - Perrin

-- Constant shallowness leads to evil. --- Martin Haase-Thomas | Tel.: +49 30 43730-558 Software Development | [EMAIL PROTECTED] ---