I'm trying to read in and parse an ascii type file that contains
information that can span several lines.
Example:
createNode animCurveTU -n test:master_globalSmooth;
setAttr .tan 9;
setAttr -s 4 .ktv[0:3] 101 0 163 0 169 0 201 0;
setAttr -s 4 .kit[3] 10;
setAttr -s 4 .kot[3] 10;
Slurp the entire file into a string and pick out the fields you need.
Sent from my iPhone 4.
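
A minimal sketch of the "slurp" idea, using only the standard library: read the whole file into one string, then let compiled regular expressions pick out the fields. The names `slurp`, `parse_anim_curves`, `BLOCK` and `KTV` are made up for illustration, the extra `transform` node in the sample is invented, and the code assumes that the `.ktv` numbers are alternating time/value pairs. It has not been timed on a multi-million-line file.

```python
import re

SAMPLE = """\
createNode animCurveTU -n test:master_globalSmooth;
setAttr .tan 9;
setAttr -s 4 .ktv[0:3] 101 0 163 0 169 0 201 0;
setAttr -s 4 .kit[3] 10;
setAttr -s 4 .kot[3] 10;
createNode transform -n other;
setAttr .v 0;
"""

# One block: an animCurveTU createNode statement plus everything up to
# the next createNode statement (or the end of the text).
BLOCK = re.compile(
    r"^createNode animCurveTU -n (?P<name>[^;\s]+);(?P<body>.*?)(?=^createNode |\Z)",
    re.MULTILINE | re.DOTALL,
)
# The numbers that follow ".ktv[...]" inside a single setAttr statement.
KTV = re.compile(r"setAttr\b[^;]*?\.ktv\[[^\]]*\]\s+(?P<values>[^;]+);")


def slurp(path):
    """Read the entire file into one string."""
    with open(path) as handle:
        return handle.read()


def parse_anim_curves(text):
    """Map each animCurveTU node name to a list of (time, value) pairs."""
    curves = {}
    for block in BLOCK.finditer(text):
        match = KTV.search(block.group("body"))
        numbers = [float(v) for v in match.group("values").split()] if match else []
        # Assumption: the numbers alternate time, value, time, value, ...
        curves[block.group("name")] = list(zip(numbers[0::2], numbers[1::2]))
    return curves


curves = parse_anim_curves(SAMPLE)
```

For a real file the call would be `parse_anim_curves(slurp(path))`. Because the patterns key on the `;` statement terminator rather than on line breaks, a statement that wraps over several lines is still matched as one unit.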
On Jul 21, 2010, at 10:42 AM, Brandon Harris brandon.har...@reelfx.com wrote:
I'm trying to read in and parse an ascii type file that contains information
that can span several lines.
Example:
what do you mean by slurp the entire file?
I'm trying to use regular expressions because line-by-line parsing will
be too slow. An example file would have somewhere in the realm of 6
million lines of code.
Brandon L. Harris
Rodrick Brown wrote:
Slurp the entire file into a string and pick
On Wed, Jul 21, 2010 at 8:12 PM, Brandon Harris
brandon.har...@reelfx.com wrote:
I'm trying to read in and parse an ascii type file that contains
information that can span several lines.
Do you have to use only regex? If not, I'd certainly suggest 'pyparsing'.
It's a pleasure to use and very
At the moment I'm trying to stick with built in python modules to create
tools for a much larger pipeline on multiple OSes.
Brandon L. Harris
Eknath Venkataramani wrote:
On Wed, Jul 21, 2010 at 8:12 PM, Brandon Harris
brandon.har...@reelfx.com wrote:
I'm trying to read in and parse an ascii type file that contains
information that can span several lines.
Example:
createNode animCurveTU -n test:master_globalSmooth;
setAttr .tan 9;
setAttr -s 4 .ktv[0:3] 101 0 163 0 169 0 201 0;
setAttr -s 4 .kit[3] 10;
setAttr -s 4
Brandon Harris wrote:
I'm trying to read in and parse an ascii type file that contains
information that can span several lines.
Example:
createNode animCurveTU -n test:master_globalSmooth;
setAttr .tan 9;
setAttr -s 4 .ktv[0:3] 101 0 163 0 169 0 201 0;
setAttr -s 4 .kit[3]
I could make it that simple, but that is also incredibly slow: on a
file with several million lines, it takes somewhere in the region of
half an hour to grab all the data. I need this to grab data from many,
many files and return the data quickly.
Brandon L. Harris
Andreas Tawn wrote:
I'm
I could make it that simple, but that is also incredibly slow: on a
file with several million lines, it takes somewhere in the region of
half an hour to grab all the data. I need this to grab data from many,
many files and return the data quickly.
Brandon L. Harris
That's surprising.
I
Could it be because that isn't the only type of data in the file? There
are many different types; this is just the one I'm trying to grab.
Brandon L. Harris
Andreas Tawn wrote:
I could make it that simple, but that is also incredibly slow and on a
file with several million lines, it takes
I could make it that simple, but that is also incredibly slow: on
a file with several million lines, it takes somewhere in the region of
half an hour to grab all the data. I need this to grab data from
many, many files and return the data quickly.
Brandon L. Harris
That's surprising.
I
Brandon Harris wrote:
I'm trying to read in and parse an ascii type file that contains
information that can span several lines.
Example:
What about something like this (you need re.MULTILINE):
In [16]: re.findall('^([^ ].*\n([ ].*\n)+)', a, re.MULTILINE)
Out[16]:
[('createNode animCurveTU
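
A hedged variant of the pattern above, for illustration only. It assumes that the continuation lines of each node are indented with spaces or tabs (the example as quoted in this thread shows no indentation, which may have been lost in transit) and that every line ends with a newline. The inner group is non-capturing, so each match is one whole block rather than a tuple; `split_blocks` is a made-up helper name and the second node in the sample is invented.

```python
import re

# A header line that starts at column 0, followed by one or more lines
# that begin with a space or a tab.
BLOCK = re.compile(r"^\S.*\n(?:[ \t].*\n)+", re.MULTILINE)


def split_blocks(text):
    """Return each header line together with its indented lines."""
    return [match.group(0) for match in BLOCK.finditer(text)]


a = (
    "createNode animCurveTU -n test:master_globalSmooth;\n"
    "    setAttr .tan 9;\n"
    "    setAttr -s 4 .kit[3] 10;\n"
    "createNode transform -n other;\n"
    "    setAttr .v 0;\n"
)

blocks = split_blocks(a)
```

A header with no indented lines after it is skipped by this pattern, because the trailing group requires at least one continuation line.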
On Wed, 21 Jul 2010 10:06:14 -0500, Brandon Harris wrote:
What do you mean by slurp the entire file? I'm trying to use regular
expressions because line-by-line parsing will be too slow. An example
file would have somewhere in the realm of 6 million lines of code.
And you think trying to run
Hey Folks,
I've got some info in a bunch of files that kind of looks like so:
Gibberish
53
MoreGarbage
12
RelevantInfo1
10/10/04
NothingImportant
ThisDoesNotMatter
44
RelevantInfo2
22
BlahBlah
343
RelevantInfo3
23
Hubris
Crap
34
and so on...
Anyhow, these fields repeat several times in a given
Yatima wrote:
Hey Folks,
I've got some info in a bunch of files that kind of looks like so:
Gibberish
53
MoreGarbage
12
RelevantInfo1
10/10/04
NothingImportant
ThisDoesNotMatter
44
RelevantInfo2
22
BlahBlah
343
RelevantInfo3
23
Hubris
Crap
34
and so on...
Anyhow, these fields repeat several times
Yatima wrote:
Hey Folks,
I've got some info in a bunch of files that kind of looks like so:
Gibberish
53
MoreGarbage
12
RelevantInfo1
10/10/04
NothingImportant
ThisDoesNotMatter
44
RelevantInfo2
22
BlahBlah
343
RelevantInfo3
23
Hubris
Crap
34
and so on...
Anyhow, these fields repeat several times
On Thu, 03 Mar 2005 09:54:02 -0700, Steven Bethard [EMAIL PROTECTED] wrote:
A possible solution, using the re module:
py> s = """\
... Gibberish
... 53
... MoreGarbage
... 12
... RelevantInfo1
... 10/10/04
... NothingImportant
... ThisDoesNotMatter
... 44
... RelevantInfo2
... 22
...
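
The session above is cut off in this listing, so the pattern that was actually posted is not shown. As a rough sketch of one re-based approach (not necessarily the one in the original message), the following assumes that each label sits on a line of its own with its value on the very next line. `PAIR` is a made-up name.

```python
import re

s = """\
Gibberish
53
MoreGarbage
12
RelevantInfo1
10/10/04
NothingImportant
ThisDoesNotMatter
44
RelevantInfo2
22
BlahBlah
343
RelevantInfo3
23
Hubris
Crap
34
"""

# A line consisting of "RelevantInfo<digits>", then the whole next line.
PAIR = re.compile(r"^(RelevantInfo\d+)\n(.*)$", re.MULTILINE)

pairs = PAIR.findall(s)
# pairs == [('RelevantInfo1', '10/10/04'), ('RelevantInfo2', '22'), ('RelevantInfo3', '23')]
```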
Have a look at martel, part of biopython. The world of bioinformatics is
filled with files with structure like this.
http://www.biopython.org/docs/api/public/Martel-module.html
James
On Thursday 03 March 2005 12:03 pm, Yatima wrote:
On Thu, 03 Mar 2005 09:54:02 -0700, Steven Bethard
[EMAIL
On Thu, 03 Mar 2005 07:14:50 -0500, Kent Johnson [EMAIL PROTECTED] wrote:
Here is a way to create a list of [RelevantInfo, value] pairs:
import cStringIO
raw_data = '''Gibberish
53
MoreGarbage
12
RelevantInfo1
10/10/04
NothingImportant
ThisDoesNotMatter
44
RelevantInfo2
22
BlahBlah
I found the original paper for Martel:
http://www.dalkescientific.com/Martel/ipc9/
On Thursday 03 March 2005 12:26 pm, James Stroud wrote:
Have a look at martel, part of biopython. The world of bioinformatics is
filled with files with structure like this.
Yatima wrote:
On Thu, 03 Mar 2005 09:54:02 -0700, Steven Bethard [EMAIL PROTECTED] wrote:
A possible solution, using the re module:
py> s = """\
... Gibberish
... 53
... MoreGarbage
... 12
... RelevantInfo1
... 10/10/04
... NothingImportant
... ThisDoesNotMatter
... 44
... RelevantInfo2
... 22
...
Here is another attempt. I'm still not sure I understand what form you want the data in. I made a
dict - dict - list structure so if you lookup e.g. scores['10/11/04']['60'] you get a list of all
the RelevantInfo2 values for Relevant1='10/11/04' and Relevant2='60'.
The parser is a simple-minded
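
The parser itself is cut off in this listing. A minimal sketch of a dict - dict - list structure along the lines described, assuming the innermost list collects the value that follows each RelevantInfo3 label, might look like this. `build_scores` and the sample values other than '10/11/04' and '60' are invented for illustration.

```python
from collections import defaultdict


def build_scores(lines):
    """Build scores[info1][info2] -> list of the values after RelevantInfo3."""
    scores = defaultdict(lambda: defaultdict(list))
    info1 = info2 = None
    it = iter(lines)
    for line in it:
        label = line.strip()
        # Each label is followed by its value on the next line.
        if label == "RelevantInfo1":
            info1 = next(it, "").strip()
        elif label == "RelevantInfo2":
            info2 = next(it, "").strip()
        elif label == "RelevantInfo3":
            scores[info1][info2].append(next(it, "").strip())
    return scores


data = """\
RelevantInfo1
10/11/04
Noise
RelevantInfo2
60
RelevantInfo3
9
RelevantInfo1
10/11/04
RelevantInfo2
60
RelevantInfo3
11
""".splitlines()

scores = build_scores(data)
```

With this sample, `scores['10/11/04']['60']` holds both values seen for that date and second key, in file order.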
On Thu, 03 Mar 2005 16:25:39 -0500, Kent Johnson [EMAIL PROTECTED] wrote:
Here is another attempt. I'm still not sure I understand what form you want
the data in. I made a
dict - dict - list structure so if you lookup e.g. scores['10/11/04']['60']
you get a list of all
the RelevantInfo2
On Thu, 3 Mar 2005 12:26:37 -0800, James Stroud [EMAIL PROTECTED] wrote:
Have a look at martel, part of biopython. The world of bioinformatics is
filled with files with structure like this.
http://www.biopython.org/docs/api/public/Martel-module.html
James
Thanks for the link. Steve and
Kent Johnson wrote:
for line in raw_data:
    if line.startswith('RelevantInfo1'):
        info1 = raw_data.next().strip()
    elif line.startswith('RelevantInfo2'):
        info2 = raw_data.next().strip()
    elif line.startswith('RelevantInfo3'):
        info3 = raw_data.next().strip()
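
The loop above is Python 2 (`raw_data.next()`). A sketch of the same idea for Python 3, wrapped in a hypothetical generator `iter_records` that yields one tuple each time a RelevantInfo3 value is read, could look like the following. It assumes `raw_data` is an iterator over lines (an open file or an `io.StringIO`), not a list, since the loop and the `next()` calls must share the same position.

```python
import io


def iter_records(raw_data):
    """Yield (info1, info2, info3) each time a RelevantInfo3 value is read."""
    info1 = info2 = None
    for line in raw_data:
        # next(raw_data, "") consumes the line after the label; the default
        # avoids an error if the file ends right after a label.
        if line.startswith("RelevantInfo1"):
            info1 = next(raw_data, "").strip()
        elif line.startswith("RelevantInfo2"):
            info2 = next(raw_data, "").strip()
        elif line.startswith("RelevantInfo3"):
            info3 = next(raw_data, "").strip()
            yield info1, info2, info3


sample = io.StringIO(
    "Gibberish\n53\nRelevantInfo1\n10/10/04\nNothingImportant\n"
    "RelevantInfo2\n22\nBlahBlah\n343\nRelevantInfo3\n23\nHubris\n"
)

records = list(iter_records(sample))
```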
Steven Bethard wrote:
Kent Johnson wrote:
for line in raw_data:
    if line.startswith('RelevantInfo1'):
        info1 = raw_data.next().strip()
    elif line.startswith('RelevantInfo2'):
        info2 = raw_data.next().strip()
    elif line.startswith('RelevantInfo3'):
        info3 =