Ryan Waples wrote:
I count only 19 lines.
yep, you are right. My bad, I think I missed copy/pasting line 20.
The first group has only three lines. See below.
Not so, the first group is actually the first four lines listed below.
Lines 1-4 serve as one group. For what it is worth, line
Alan Gauld wrote:
On 19/07/12 07:00, Steven D'Aprano wrote:
for reads, lines in four_lines( INFILE ):
    ID_Line_1, Seq_Line, ID_Line_2, Quality_Line = lines
Shouldn't that be
for reads, lines in enumerate( four_lines(INFILE) ):
    ID_Line_1,
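A small sketch of why the enumerate() version is the one that works: each item from four_lines is a group of four lines, so unpacking it straight into two names fails, while enumerate() supplies a counter plus the group. The records list below is made-up stand-in data, not anything from the original files, and the code is written for Python 3.

```python
# Hypothetical stand-in for what four_lines(INFILE) would yield:
# one 4-tuple per record.
records = [
    ("@r1", "ACGT", "+r1", "IIII"),
    ("@r2", "TTGA", "+r2", "HHHH"),
]

# Without enumerate(), each 4-tuple is unpacked into two names and fails.
error = None
try:
    for reads, lines in records:
        pass
except ValueError as exc:
    error = str(exc)

# With enumerate(), 'reads' is a running count and 'lines' is the 4-tuple.
seen = []
for reads, lines in enumerate(records):
    ID_Line_1, Seq_Line, ID_Line_2, Quality_Line = lines
    seen.append((reads, Seq_Line))
```

If the count should start at 1 rather than 0, enumerate(records, 1) does that.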
On Wed, Jul 18, 2012 at 04:33:20PM -0700, Ryan Waples wrote:
I've included 20 consecutive lines of input and output. Each of these
5 'records' should have been selected and printed to the output file.
I count only 19 lines. The first group has only three lines. See below.
There is a blank
If you copy those files to a different device (one that has just been
scrubbed and reformatted), then copy them back and get different results
with your application, you've found your problem.
-Bill
Thanks for the insistence, I'll check this out. If you have any
guidance on how to do
I count only 19 lines.
yep, you are right. My bad, I think I missed copy/pasting line 20.
The first group has only three lines. See below.
Not so, the first group is actually the first four lines listed below.
Lines 1-4 serve as one group. For what it is worth, line four should
have 1
On 19/07/12 07:00, Steven D'Aprano wrote:
def four_lines(file_object):
snipping
    line1 = next(file_object).strip()
    # Get the next three lines, padding if needed.
    line2 = next(file_object, '').strip()
    line3 = next(file_object, '').strip()
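For anyone reading the archive without the full message, here is a minimal self-contained sketch of the grouping idea quoted above. It is not the snipped original: the loop, the fourth line, the end-of-file handling and the sample data are all assumptions filled in for illustration, and it is written for Python 3.

```python
import io

def four_lines(file_object):
    """Yield stripped lines from file_object in groups of four.

    A short final group is padded with empty strings.
    """
    while True:
        try:
            # No first line means end of file, so stop cleanly.
            line1 = next(file_object).strip()
        except StopIteration:
            return
        # Get the next three lines, padding if needed.
        line2 = next(file_object, '').strip()
        line3 = next(file_object, '').strip()
        line4 = next(file_object, '').strip()
        yield (line1, line2, line3, line4)

# Made-up sample: one complete record and one missing its last line.
sample = io.StringIO("@read1\nACGT\n+read1\nIIII\n@read2\nTTGA\n+read2\n")
groups = list(four_lines(sample))
```

The padded empty string in the last group makes a truncated record easy to spot, which is the kind of thing worth checking when 19 lines show up where 20 were expected.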
Just a few notes...
On Wed, 18 Jul 2012, Ryan Waples wrote:
snip
import glob
my_in_files = glob.glob ('E:/PINK/Paired_End/raw/gzip/*.fastq')
for each in my_in_files:
    #print(each)
    out = each.replace('/gzip', '/rem_clusters2' )
    #print (out)
    INFILE = open (each,
I'm seeing some unexpected output when I use a script (included at
end) to iterate over large text files. I am unsure of the source of
the unexpected output and any help would be much appreciated.
Background
Python v 2.7.1
Windows 7 32bit
Reading and writing to an external USB hard drive
Data
On Wed, Jul 18, 2012 at 04:33:20PM -0700, Ryan Waples wrote:
I'm seeing some unexpected output when I use a script (included at
end) to iterate over large text files. I am unsure of the source of
the unexpected output and any help would be much appreciated.
It may help if you can simplify
Hi Ryan
One quick comment
I didn't get through all your code to figure out the fine details but
my hunch is you might be having issues related to linux to dos EOF
char. Could you check that the total number of lines in your fastq# is the
same as read by a simple python file iterator? If not then it is
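The check suggested above could look roughly like this. It is only a sketch: count_lines and the temporary demo file are hypothetical names and data for illustration, and the file is opened in binary mode so that no end-of-line or end-of-file translation can hide lines from the count.

```python
import os
import tempfile

def count_lines(path):
    """Count lines by iterating over the file in binary mode."""
    with open(path, 'rb') as handle:
        return sum(1 for _ in handle)

# Made-up demo file: three 4-line records.
with tempfile.NamedTemporaryFile('wb', suffix='.fastq', delete=False) as tmp:
    tmp.write(b"@read1\nACGT\n+read1\nIIII\n" * 3)
    demo_path = tmp.name

total = count_lines(demo_path)
os.remove(demo_path)
```

Comparing this number with the line count from another tool, and checking that it divides evenly by four, would show quickly whether lines are being lost on the way in.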
On Jul 18, 2012, at 7:33 PM, Ryan Waples wrote:
I'm seeing some unexpected output when I use a script (included at
end) to iterate over large text files. I am unsure of the source of
the unexpected output and any help would be much appreciated.
Background
Python v 2.7.1
Windows 7 32bit
Thanks for the replies, I'll try to address the questions raised and
spur further conversation.
those numbers (4GB and 64M lines) look suspiciously close to the file and
record pointer limits to a 32-bit file system. Are you sure you aren't
bumping into wrap around issues of some sort?
My
On Jul 18, 2012, at 10:33 PM, Ryan Waples wrote:
Thanks for the replies, I'll try to address the questions raised and
spur further conversation.
those numbers (4GB and 64M lines) look suspiciously close to the file and
record pointer limits to a 32-bit file system. Are you sure you aren't
grep ^TTCTGTGAGTGATTTCCTGCAAGACAGGAATGTCAGT$ with no results
How about:
grep TTCTGTGAGTGATTTCCTGCAAGACAGGAATGTCAGT outfile
Just in case there is some non-printing character in there...
Beyond that ... my guess would be that you are either not reading the file you
think you are, or not
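The anchored versus unanchored grep comparison above can also be done in Python, which has the advantage that repr() makes any stray character visible. This is a sketch with made-up lines (one clean, one with a trailing carriage return); exact, loose and suspects are names chosen for illustration.

```python
SEQ = "TTCTGTGAGTGATTTCCTGCAAGACAGGAATGTCAGT"

# Made-up sample lines: a clean match, a match with a stray '\r',
# and an unrelated line.
lines = [SEQ + "\n", SEQ + "\r\n", "ACGT\n"]

# Like grep ^SEQ$: the whole line (minus its newline) must equal SEQ.
exact = [line for line in lines if line.rstrip("\n") == SEQ]

# Like grep SEQ: the sequence may appear anywhere in the line.
loose = [line for line in lines if SEQ in line]

# Lines that match loosely but not exactly carry something extra;
# repr() shows what it is.
suspects = [repr(line) for line in loose if line not in exact]
```

If suspects comes back empty on the real output file, non-printing characters are probably not the explanation.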
On Wed, Jul 18, 2012 at 8:04 PM, William R. Wing (Bill Wing)
w...@mac.com wrote:
On Jul 18, 2012, at 10:33 PM, Ryan Waples wrote:
Thanks for the replies, I'll try to address the questions raised and
spur further conversation.
those numbers (4GB and 64M lines) look suspiciously close to the
On Wed, Jul 18, 2012 at 8:23 PM, Lee Harr miss...@hotmail.com wrote:
grep ^TTCTGTGAGTGATTTCCTGCAAGACAGGAATGTCAGT$ with no results
How about:
grep TTCTGTGAGTGATTTCCTGCAAGACAGGAATGTCAGT outfile
Just in case there is some non-printing character in there...
There are many instances of that