On Wed, Jul 18, 2012 at 8:04 PM, William R. Wing (Bill Wing)
<w...@mac.com> wrote:
> On Jul 18, 2012, at 10:33 PM, Ryan Waples wrote:
>
>> Thanks for the replies; I'll try to address the questions raised and
>> spur further conversation.
>>
>>> "those numbers (4GB and 64M lines) look suspiciously close to the file and 
>>> record pointer limits to a 32-bit file system.  Are you sure you aren't 
>>> bumping into wrap around issues of some sort?"
>>
>> My understanding is that I am reading the files as a stream, one line
>> at a time, never loading them into memory all at once.  I would like
>> (and expect) my script to be able to handle files of at least 50GB.
>> If this would cause a problem, let me know.
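
For reference, a minimal sketch of the line-by-line reading described
above (the file name and the per-line work are placeholders, not the
actual script):

def process_line(line):
    # Placeholder for whatever the real script does with each line.
    pass

with open("input.txt") as infile:
    for line in infile:       # the file object yields one line at a time,
        process_line(line)    # so the whole file is never held in memory
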
>
> [Again, stripping out everything else…]
>
> I don't think you understood my concern.  The issue isn't whether or not the
> files are being read as a stream; the issue is that at numbers like those, a
> 32-bit file system can silently fail.  If the pointers chaining allocation
> blocks together (or whatever Windows calls them) can't index to sufficiently
> large offsets, then you WILL get garbage included in the file stream.
>
> If you copy those files to a different device (one that has just been 
> scrubbed and reformatted), then copy them back and get different results with 
> your application, you've found your problem.
>
> -Bill

Thanks for the insistence; I'll check this out.  If you have any
guidance on how to do so, let me know.  I knew my system wasn't
particularly well suited to the task at hand, but I hadn't seen how
it could actually cause problems.
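
Would hashing both copies in fixed-size chunks and comparing the
digests be a reasonable way to do that comparison?  Something like the
sketch below is what I have in mind (the file names are placeholders):

import hashlib

def chunked_md5(path, chunk_size=1024 * 1024):
    # Read the file in 1 MB chunks so it is never loaded all at once.
    digest = hashlib.md5()
    with open(path, "rb") as f:
        while True:
            chunk = f.read(chunk_size)
            if not chunk:
                break
            digest.update(chunk)
    return digest.hexdigest()

# "original.dat" and "copy_after_roundtrip.dat" are placeholder names.
print(chunked_md5("original.dat") == chunked_md5("copy_after_roundtrip.dat"))

If the digests differ after the round trip to the freshly formatted
device, I take it that would point at the file system rather than my
script.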

-Ryan
