On Dec 21, 2:01 am, Alexander Kapps alex.ka...@web.de wrote:
On 20.12.2011 22:04, Nick Dokos wrote:
I have a text file containing such data ;
A B C
---
-2.0100e-01 8.000e-02
Hi all,
I have a text file containing data like this:
A B C
---
-2.0100e-01 8.000e-02 8.000e-05
-2.e-01 0.000e+00 4.800e-04
-1.9900e-01 4.000e-02 1.600e-04
But I only need Section B, and I
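The column extraction the poster asks for can be sketched in a few lines of Python (a hypothetical sketch; the file layout and function name are assumptions, not the poster's code):

```python
def column_b(lines):
    """Return the second whitespace-separated field of each numeric data row."""
    out = []
    for line in lines:
        fields = line.split()
        if len(fields) != 3:
            continue                        # skip blank lines and "---"
        try:
            out.append(float(fields[1]))    # column B
        except ValueError:
            continue                        # skip header rows like "A B C"
    return out

rows = ["A B C", "---", "-2.0100e-01 8.000e-02 8.000e-05"]
print(column_b(rows))  # [0.08]
```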
On 12/20/2011 02:17 PM, Yigit Turgut wrote:
Hi all,
I have a text file containing such data ;
A B C
---
-2.0100e-01 8.000e-02 8.000e-05
-2.e-01 0.000e+00 4.800e-04
-1.9900e-01 4.000e-02
Tue, 20 Dec 2011 11:17:15 -0800 (PST)
Yigit Turgut wrote:
Hi all,
I have a text file containing such data ;
A B C
---
-2.0100e-01 8.000e-02 8.000e-05
-2.e-01 0.000e+00 4.800e-04
Jérôme jer...@jolimont.fr wrote:
Tue, 20 Dec 2011 11:17:15 -0800 (PST)
Yigit Turgut wrote:
Hi all,
I have a text file containing such data ;
A B C
---
-2.0100e-01 8.000e-02
On 20.12.2011 22:04, Nick Dokos wrote:
I have a text file containing such data ;
A B C
---
-2.0100e-01 8.000e-02 8.000e-05
-2.e-01 0.000e+00 4.800e-04
-1.9900e-01 4.000e-02 1.600e-04
On Mon, Jul 4, 2011 at 12:36 AM, Xah Lee xah...@gmail.com wrote:
So, a solution by regex is out.
Actually, none of the complications you listed appear to exclude
regexes. Here's a possible (untested) solution:
<div class="img">
((?:\s*<img src="[^.]+\.(?:jpg|png|gif)" alt="[^"]+" width="[0-9]+
On Jul 4, 12:13 pm, S.Mandl stefanma...@web.de wrote:
Nice. I guess that XSLT would be another (the official) approach for
such a task.
Is there an XSLT-engine for Emacs?
-- Stefan
haven't used XSLT, and don't know if there's one in emacs...
it'd be nice if someone actually give a
On Jul 5, 12:17 pm, Ian Kelly ian.g.ke...@gmail.com wrote:
On Mon, Jul 4, 2011 at 12:36 AM, Xah Lee xah...@gmail.com wrote:
So, a solution by regex is out.
Actually, none of the complications you listed appear to exclude
regexes. Here's a possible (untested) solution:
<div class="img">
On Tue, Jul 5, 2011 at 2:37 PM, Xah Lee xah...@gmail.com wrote:
but in any case, I can't see how this part would work:
<p class="cpt">((?:[^<]|<(?!/p>))+)</p>
It's not that different from the pattern 「alt="[^"]+"」 earlier in the
regex. The capture group accepts one or more characters that either
aren't '<',
haven't used XSLT, and don't know if there's one in emacs...
it'd be nice if someone actually give a example...
Hi Xah, actually I have to correct myself. HTML is not XML. If it
were, you
could use a stylesheet like this:
<?xml version="1.0" encoding="ISO-8859-1"?>
<xsl:stylesheet version="1.0"
.
--
Emacs Lisp: Processing HTML: Transform Tags to HTML5 “figure” and
“figcaption” Tags
Xah Lee, 2011-07-03
Another triumph of using elisp for text processing over perl/python.
The Problem
--
Summary
I want batch transform
Nice. I guess that XSLT would be another (the official) approach for
such a task.
Is there an XSLT-engine for Emacs?
-- Stefan
--
http://mail.python.org/mailman/listinfo/python-list
Is text processing with dicts a good use case for Python
cross-compilers like Cython/Pyrex or ShedSkin? (I've read the
cross compiler claims about massive increases in pure numeric
performance).
I have 3 use cases I'm considering for Python-to-C++
cross-compilers for generating 32-bit Python
pyt...@bdurham.com, 16.12.2010 21:03:
Is text processing with dicts a good use case for Python
cross-compilers like Cython/Pyrex or ShedSkin? (I've read the
cross compiler claims about massive increases in pure numeric
performance).
Cython is generally a good choice for string processing
New to Python. I can solve the problem in perl by using split() to
an array. Can't figure it out in Python.
I'm reading variable lines of text. I want to use the first number I
find. The problem is the lines are variable.
Input example:
this is a number: 1
here are some numbers 1 2 3 4
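Pulling the first number out of a variable line is a one-regex job in Python; this is a sketch (the helper name is made up, not the poster's code):

```python
import re

def first_number(line):
    """Return the first integer found in the line, or None if there is none."""
    m = re.search(r"-?\d+", line)
    return int(m.group()) if m else None

print(first_number("this is a number: 1"))            # 1
print(first_number("here are some numbers 1 2 3 4"))  # 1
print(first_number("no digits here"))                 # None
```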
On Thu, Sep 10, 2009 at 11:36 AM, AJAskey aske...@gmail.com wrote:
New to Python. I can solve the problem in perl by using split() to
an array. Can't figure it out in Python.
I'm reading variable lines of text. I want to use the first number I
find. The problem is the lines are variable.
Never mind. I guess I had been trying to make it more difficult than
it is. As a note, I can work on something for 10 hours and not figure
it out. But the second I post to a group, then I immediately figure
it out myself. Strange snake this Python...
Example for anyone else interested:
line =
Thanks Black Jack
Working
I have a string like the following:
12560/ABC,12567/BC,123,567,890/JK
I want to group the above string like this:
(12560,ABC)
(12567,BC)
(123,567,890,JK)
I tried a regular expression; I am able to get the first two but not the
third one. Can regular expressions put the data into different groups?
--
On Thu, 25 Sep 2008 15:51:28 +0100, [EMAIL PROTECTED] wrote:
I have string like follow
12560/ABC,12567/BC,123,567,890/JK
I want above string to group like as follow (12560,ABC)
(12567,BC)
(123,567,890,JK)
i try regular expression i am able to get first two not the third one.
can
You can do it with regexps too :
--
import re
to_watch = re.compile(r'(?P<number>\d+)[/](?P<letter>[A-Z]+)')
final_list = to_watch.findall("12560/ABC,12567/BC,123,567,890/JK")
for number, word in final_list:
    print "number:%s -- word:%s" % (number, word)
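One way to pick up the third, comma-containing group as well is to let the number part of the pattern include internal commas. A sketch (untested against any inputs beyond the sample string):

```python
import re

s = "12560/ABC,12567/BC,123,567,890/JK"
# Digits optionally joined by commas, then "/", then uppercase letters.
pairs = re.findall(r"(\d+(?:,\d+)*)/([A-Z]+)", s)
print(pairs)  # [('12560', 'ABC'), ('12567', 'BC'), ('123,567,890', 'JK')]
```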
On Sep 25, 6:34 pm, Marc 'BlackJack' Rintsch [EMAIL PROTECTED] wrote:
On Thu, 25 Sep 2008 15:51:28 +0100, [EMAIL PROTECTED] wrote:
I have string like follow
12560/ABC,12567/BC,123,567,890/JK
I want above string to group like as follow (12560,ABC)
(12567,BC)
(123,567,890,JK)
i try
On Sep 25, 9:51 am, [EMAIL PROTECTED] [EMAIL PROTECTED]
wrote:
I have string like follow
12560/ABC,12567/BC,123,567,890/JK
I want above string to group like as follow
(12560,ABC)
(12567,BC)
(123,567,890,JK)
i try regular expression i am able to get first two not the third one.
can
Text Processing with Emacs Lisp
Xah Lee, 2007-10-29
This page gives an outline of how to use emacs lisp to do text
processing, using a specific real-world problem as example. If you
don't know elisp, first take a gander at Emacs Lisp Basics.
HTML version with links and colors is at:
http
... continued from previous post.
PS I'm cross-posting this post to perl and python groups because I
find it to be a little-known fact that emacs lisp's power in the
area of text processing is far beyond Perl (or Python).
... i worked as a professional perl programer since 1998. I started
[EMAIL PROTECTED] wrote:
And now for something completely different...
I've been reading up a bit about Python and Excel and I quickly told
the program to output to Excel quite easily. However, what if the
input file were a Word document? I can't seem to find much
information about parsing
patrick.waldo wrote:
manipulation? Also, I conceptually get it, but would you mind walking
me through
for key, group in groupby(instream, unicode.isspace):
    if not key:
        yield "".join(group)
itertools.groupby() splits a sequence into groups with the same key; e. g.
to
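The groupby idiom being described can be seen in a small self-contained example (Python 3 here, so `str.isspace` stands in for the thread's `unicode.isspace`):

```python
from itertools import groupby

lines = ["alpha\n", "beta\n", "\n", "gamma\n"]
# Group consecutive lines by whether they are blank; keep the non-blank runs.
records = ["".join(g) for blank, g in groupby(lines, str.isspace) if not blank]
print(records)  # ['alpha\nbeta\n', 'gamma\n']
```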
And now for something completely different...
I see a lot of COM stuff with Python for excel...and I quickly made
the same program output to excel. What if the input file were a Word
document? Where is there information about manipulating word
documents, or what could I add to make the same
And now for something completely different...
I've been reading up a bit about Python and Excel and I quickly told
the program to output to Excel quite easily. However, what if the
input file were a Word document? I can't seem to find much
information about parsing Word files. What could I add
lines = open('your_file.txt').readlines()[:4]
print lines
print map(len, lines)
gave me:
['\xef\xbb\xbf200-720-769-93-2\n', 'kyselina mo\xc4\x8dov
\xc3\xa1 C5H4N4O3\n', '\n', '200-001-8\t50-00-0\n']
[28, 32, 1, 18]
I think it means that I'm still at option 3. I got
On Mon, 15 Oct 2007 10:47:16 +, patrick.waldo wrote:
my sample input file looks like this (not organized, as you see it):
200-720-769-93-2
kyselina mocová C5H4N4O3
200-001-8 50-00-0
formaldehyd CH2O
200-002-3
50-01-1
guanidínium-chlorid CH5N3.ClH
On Oct 15, 12:20 pm, Marc 'BlackJack' Rintsch [EMAIL PROTECTED] wrote:
On Mon, 15 Oct 2007 10:47:16 +, patrick.waldo wrote:
my sample input file looks like this (not organized, as you see it):
200-720-769-93-2
kyselina mocová C5H4N4O3
200-001-8 50-00-0
formaldehyd
patrick.waldo wrote:
my sample input file looks like this (not organized, as you see it):
200-720-769-93-2
kyselina mocová C5H4N4O3
200-001-8 50-00-0
formaldehyd CH2O
200-002-3
50-01-1
guanidínium-chlorid CH5N3.ClH
Assuming that the records are always
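One way to regroup records like these is to key on the EINECS-looking token that starts each record. This is only a sketch under that assumption (the `\d{3}-\d{3}-\d` pattern is inferred from the sample; fused lines like `200-720-769-93-2` are not handled):

```python
import re

EINECS = re.compile(r"^\d{3}-\d{3}-\d$")   # e.g. 200-001-8 (assumed format)

def records(lines):
    # Start a new record at each EINECS-looking token and collect
    # everything up to the next one.
    rec = []
    for token in " ".join(lines).split():
        if EINECS.match(token) and rec:
            yield rec
            rec = []
        rec.append(token)
    if rec:
        yield rec

sample = ["200-001-8 50-00-0", "formaldehyd CH2O", "200-002-3", "50-01-1"]
print(list(records(sample)))
```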
Wow, thank you all. All three work. To output correctly I needed to
add:
output.write("\r\n")
This is really a great help!!
Because of my limited Python knowledge, I will need to try to figure
out exactly how they work for future text manipulation and for my own
knowledge. Could you recommend
On Oct 15, 10:08 pm, [EMAIL PROTECTED] wrote:
Because of my limited Python knowledge, I will need to try to figure
out exactly how they work for future text manipulation and for my own
knowledge. Could you recommend some resources for this kind of text
manipulation? Also, I conceptually get
On Oct 14, 8:48 am, [EMAIL PROTECTED] wrote:
Hi all,
I started Python just a little while ago and I am stuck on something
that is really simple, but I just can't figure out.
Essentially I need to take a text document with some chemical
information in Czech and organize it into another text
Hi all,
I started Python just a little while ago and I am stuck on something
that is really simple, but I just can't figure out.
Essentially I need to take a text document with some chemical
information in Czech and organize it into another text file. The
information is always EINECS number,
On Sun, 14 Oct 2007 13:48:51 +, patrick.waldo wrote:
Essentially I need to take a text document with some chemical
information in Czech and organize it into another text file. The
information is always EINECS number, CAS, chemical name, and formula
in tables. I need to organize them
On Oct 14, 2:48 pm, [EMAIL PROTECTED] wrote:
Hi all,
I started Python just a little while ago and I am stuck on something
that is really simple, but I just can't figure out.
Essentially I need to take a text document with some chemical
information in Czech and organize it into another text
Thank you both for helping me out. I am still rather new to Python
and so I'm probably trying to reinvent the wheel here.
When I try to do Paul's response, I get
tokens = line.strip().split()
[]
So I am not quite sure how to read line by line.
tokens = input.read().split() gets me all the
On Sun, 14 Oct 2007 16:57:06 +, patrick.waldo wrote:
Thank you both for helping me out. I am still rather new to Python
and so I'm probably trying to reinvent the wheel here.
When I try to do Paul's response, I get
tokens = line.strip().split()
[]
What is in `line`? Paul wrote this
On Oct 14, 11:48 pm, [EMAIL PROTECTED] wrote:
Hi all,
I started Python just a little while ago and I am stuck on something
that is really simple, but I just can't figure out.
Essentially I need to take a text document with some chemical
information in Czech and organize it into another text
On Sep 7, 3:50 am, George Sakkis [EMAIL PROTECTED] wrote:
On Sep 5, 5:17 pm, [EMAIL PROTECTED] [EMAIL PROTECTED]
wrote:
If this was a code golf challenge,
I'd choose the Unix split solution and be both maintainable as well as
concise :-)
- Paddy.
--
Thanks for making me aware of the (UNIX) split command (split -l 5
inFile.txt), it's short, it's fast, it's beautiful.
I am still wondering how to do this efficiently in Python (being kind
of new to it... and it's not for homework).
Something like this should do the job:
def nlines(num,
[EMAIL PROTECTED] wrote:
I am still wondering how to do this efficiently in Python (being kind
of new to it... and it's not for homework).
You should post some code anyway, it would be easier to give useful advice (it
would also demonstrate that you put some effort into it).
Anyway, here is
Here's my solution, for what it's worth:
#!/usr/bin/env python
import os

input = open("test.txt", "r")
counter = 0
fileNum = 0
fileName = ""

def newFileName():
    global fileNum, fileName
    while os.path.exists(fileName) or fileName == "":
        fileNum += 1
        x = "%0.5d" % fileNum
On Sep 5, 5:17 pm, [EMAIL PROTECTED] [EMAIL PROTECTED]
wrote:
On Sep 5, 1:28 pm, Paddy [EMAIL PROTECTED] wrote:
On Sep 5, 5:13 pm, [EMAIL PROTECTED] [EMAIL PROTECTED]
wrote:
I have a text source file of about 20.000 lines. From this file, I like
to write the first 5 lines to a new
Shawn Milochik wrote:
On 9/5/07, [EMAIL PROTECTED] [EMAIL PROTECTED] wrote:
I have a text source file of about 20.000 lines.
From this file, I like to write the first 5 lines to a new file. Close
that file, grab the next 5 lines write these to a new file... grabbing
5 lines and creating new
I have a text source file of about 20.000 lines.
From this file, I like to write the first 5 lines to a new file. Close
that file, grab the next 5 lines write these to a new file... grabbing
5 lines and creating new files until processing of all 20.000 lines is
done.
Is there an efficient way to
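The grab-5-lines-at-a-time step can be sketched with itertools.islice; the chunking is separated from the file writing so the core idea stands alone (file naming here is illustrative, not from the thread):

```python
from itertools import islice

def chunks(lines, n=5):
    """Yield successive lists of n lines; the last chunk may be shorter."""
    it = iter(lines)
    while True:
        block = list(islice(it, n))
        if not block:
            return
        yield block

# Each chunk would then go to its own file, e.g. open("out%05d.txt" % i, "w").
for i, block in enumerate(chunks(["line %d\n" % j for j in range(12)])):
    print(i, len(block))
```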
On Sep 5, 11:13 am, [EMAIL PROTECTED] [EMAIL PROTECTED]
wrote:
I have a text source file of about 20.000 lines. From this file, I like to
write the first 5 lines to a new file. Close
that file, grab the next 5 lines write these to a new file... grabbing
5 lines and creating new files until
[EMAIL PROTECTED] wrote:
I have a text source file of about 20.000 lines.
From this file, I like to write the first 5 lines to a new file. Close
that file, grab the next 5 lines write these to a new file... grabbing
5 lines and creating new files until processing of all 20.000 lines is
[EMAIL PROTECTED] wrote:
I would use a counter in a for loop using the readline method to
iterate over the 20,000 line file.
file objects are iterables themselves, so there's no need to do that
by using a method.
Reset the counter every 5 lines/ iterations and close the file.
I'd use a
On 9/5/07, [EMAIL PROTECTED] [EMAIL PROTECTED] wrote:
I have a text source file of about 20.000 lines.
From this file, I like to write the first 5 lines to a new file. Close
that file, grab the next 5 lines write these to a new file... grabbing
5 lines and creating new files until processing
On Sep 5, 11:57 am, Bjoern Schliessmann usenet-
[EMAIL PROTECTED] wrote:
[EMAIL PROTECTED] wrote:
I would use a counter in a for loop using the readline method to
iterate over the 20,000 line file.
file objects are iterables themselves, so there's no need to do that
by using a method.
[EMAIL PROTECTED] wrote:
I have a text source file of about 20.000 lines.
From this file, I like to write the first 5 lines to a new file. Close
that file, grab the next 5 lines write these to a new file... grabbing
5 lines and creating new files until processing of all 20.000 lines is
done.
On Sep 5, 5:13 pm, [EMAIL PROTECTED] [EMAIL PROTECTED]
wrote:
I have a text source file of about 20.000 lines. From this file, I like to
write the first 5 lines to a new file. Close
that file, grab the next 5 lines write these to a new file... grabbing
5 lines and creating new files until
On Sep 5, 1:28 pm, Paddy [EMAIL PROTECTED] wrote:
On Sep 5, 5:13 pm, [EMAIL PROTECTED] [EMAIL PROTECTED]
wrote:
I have a text source file of about 20.000 lines. From this file, I like to
write the first 5 lines to a new file. Close
that file, grab the next 5 lines write these to a new
On Sep 5, 5:13 pm, [EMAIL PROTECTED] [EMAIL PROTECTED]
wrote:
I have a text source file of about 20.000 lines. From this file, I like to
write the first 5 lines to a new file. Close
that file, grab the next 5 lines write these to a new file... grabbing
5 lines and creating new files until
Arnaud Delobelle wrote:
[...]
from my_useful_functions import (new_file, write_first_5_lines,
    done_processing_file, grab_next_5_lines, another_new_file, write_these)
in_f = open('myfile')
out_f = new_file()
write_first_5_lines(in_f, out_f) # write first 5 lines
close(out_f)
while not
can
parse lines from read buffer freely.
have fun!
- Original Message -
From: Shawn Milochik [EMAIL PROTECTED]
To: python-list@python.org
Sent: Thursday, September 06, 2007 1:03 AM
Subject: Re: Text processing and file creation
On 9/5/07, [EMAIL PROTECTED] [EMAIL PROTECTED] wrote:
I
On Sep 6, 12:46 am, Steve Holden [EMAIL PROTECTED] wrote:
Arnaud Delobelle wrote:
[...]
print "all done!"  # All done
print "Now there are 4000 files in this directory..."
Python 3.0 - ready (I've used open() instead of file())
bzzt!
Python 3.0a1 (py3k:57844, Aug 31 2007, 16:54:27)
I'm in a process of rewriting a bash/awk/sed script -- that grew too
big -- in Python. I can rewrite it in a simple line-by-line way but
that results in ugly python code and I'm sure there is a simple
pythonic way.
The bash script processed text files of the form:
Hi list,
I'm in a process of rewriting a bash/awk/sed script -- that grew too
big -- in Python. I can rewrite it in a simple line-by-line way but
that results in ugly python code and I'm sure there is a simple
pythonic way.
The bash script processed text files of the form:
Daniel Nogradi:
Any elegant solution for this?
This is my first try:
ddata = {}
inside_matrix = False
for row in file("data.txt"):
    if row.strip():
        fields = row.split()
        if len(fields) == 2:
            inside_matrix = False
            ddata[fields[0]] = [fields[1]]
This is my first try:
ddata = {}
inside_matrix = False
for row in file("data.txt"):
    if row.strip():
        fields = row.split()
        if len(fields) == 2:
            inside_matrix = False
            ddata[fields[0]] = [fields[1]]
            lastkey = fields[0]
        else:
On Mar 23, 10:30 pm, Daniel Nogradi [EMAIL PROTECTED] wrote:
Hi list,
I'm in a process of rewriting a bash/awk/sed script -- that grew too
big -- in Python. I can rewrite it in a simple line-by-line way but
that results in ugly python code and I'm sure there is a simple
pythonic way.
The
On Mar 23, 5:30 pm, Daniel Nogradi [EMAIL PROTECTED] wrote:
Hi list,
I'm in a process of rewriting a bash/awk/sed script -- that grew too
big -- in Python. I can rewrite it in a simple line-by-line way but
that results in ugly python code and I'm sure there is a simple
pythonic way.
The
I have a pair of python programs that parse and index files on my computer
to make them searchable. The problem that I have is that they continually
grow until my system is out of memory, and then things get ugly. I
remember, when I was first learning python, reading that the python
interpreter
After reading
http://www.python.org/doc/faq/general/#how-does-python-manage-memory, I
tried modifying this program as below:
a = []
for i in xrange(33, 127):
    for j in xrange(33, 127):
        for k in xrange(33, 127):
            for l in xrange(33, 127):
                a.append(chr(i) + chr(j) + chr(k) + chr(l))
import sys
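The quadruply nested loop above can also be written with itertools.product. This sketch uses `repeat=2` just to keep the list small; `repeat=4` reproduces the original:

```python
from itertools import product

chars = [chr(c) for c in range(33, 127)]   # printable ASCII, '!' .. '~'
# Same construction as the nested loops, one comprehension instead of four.
a = ["".join(t) for t in product(chars, repeat=2)]
print(len(a), a[0], a[-1])  # 8836 !! ~~
```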
On 1/8/07, tsuraan [EMAIL PROTECTED] wrote:
[snip]
The loop is deep enough that I always interrupt it once python's size is
around 250 MB. Once the gc.collect() call is finished, python's size has
not changed a bit.
[snip]
This has been tried under python 2.4.3 in gentoo linux and python 2.3
I just tried on my system
(Python is using 2.9 MiB)
a = ['a' * (1 << 20) for i in xrange(300)]
(Python is using 304.1 MiB)
del a
(Python is using 2.9 MiB -- as before)
And I didn't even need to tell the garbage collector to do its job. Some
info:
It looks like the big difference between our
On 1/8/07, tsuraan [EMAIL PROTECTED] wrote:
I just tried on my system
(Python is using 2.9 MiB)
a = ['a' * (1 << 20) for i in xrange(300)]
(Python is using 304.1 MiB)
del a
(Python is using 2.9 MiB -- as before)
And I didn't even need to tell the garbage collector to do its
On 1/8/07, Felipe Almeida Lessa [EMAIL PROTECTED] wrote:
On 1/8/07, tsuraan [EMAIL PROTECTED] wrote:
I just tried on my system
(Python is using 2.9 MiB)
a = ['a' * (1 << 20) for i in xrange(300)]
(Python is using 304.1 MiB)
del a
(Python is using 2.9 MiB -- as before)
$ python
Python 2.4.4c1 (#2, Oct 11 2006, 21:51:02)
[GCC 4.1.2 20060928 (prerelease) (Ubuntu 4.1.1-13ubuntu5)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> # Python is using 2.7 MiB
>>> a = ['1234' for i in xrange(10 << 20)]
>>> # Python is using 42.9 MiB
>>> del a
>>> #
My first thought was that interned strings were causing the growth,
but that doesn't seem to be the case.
Interned strings, as of 2.3, are no longer immortal, right? The intern doc
says you have to keep a reference around to the string now, anyhow. I
really wish I could find that thing I
On 1/8/07, tsuraan [EMAIL PROTECTED] wrote:
My first thought was that interned strings were causing the growth,
but that doesn't seem to be the case.
Interned strings, as of 2.3, are no longer immortal, right? The intern doc
says you have to keep a reference around to the string now,
I remember something about it coming up in some of the discussions of
free lists and better behavior in this regard in 2.5, but I don't
remember the details.
Under Python 2.5, my original code posting no longer exhibits the bug - upon
calling del(a), python's size shrinks back to ~4 MB, which
I am beginning to use python primarily to organize data into formats
needed for input into some statistical packages. I do not have much
programming experience outside of LaTeX and R, so some of this is a bit
new. I am attempting to write a program that reads in a text file that
contains some
Harold To illustrate, assume I have a text file, call it test.txt, with
Harold the following information:
Harold X11 .32
Harold X22 .45
Harold My goal in the python program is to manipulate this file such
Harold that a new file would be created that looks like:
(I tried to post this yesterday but I think my ISP ate it. Apologies if
this is a double-post.)
Is it possible to do very fast string processing in python? My
bioinformatics application needs to scan very large ASCII files (80GB+),
compare adjacent lines, and conditionally do some further
Alexis Gallagher wrote:
(I tried to post this yesterday but I think my ISP ate it. Apologies if
this is a double-post.)
Is it possible to do very fast string processing in python? My
bioinformatics application needs to scan very large ASCII files (80GB+),
compare adjacent lines, and
Maybe this code will be faster? (If it even does the same thing:
largely untested)
filehandle = open(data,'r',buffering=1000)
fileIter = iter(filehandle)
lastLine = fileIter.next()
lastTokens = lastLine.strip().split(delimiter)
lastGeno = extract(lastTokens[0])
for currentLine in fileIter:
Steve,
First, many thanks!
Steve Holden wrote:
Alexis Gallagher wrote:
filehandle = open(data,'r',buffering=1000)
This buffer size seems, shall we say, unadventurous? It's likely to slow
things down considerably, since the filesystem is probably going to
naturally want to use a rather
Alexis Gallagher wrote:
Steve,
First, many thanks!
Steve Holden wrote:
Alexis Gallagher wrote:
filehandle = open(data,'r',buffering=1000)
This buffer size seems, shall we say, unadventurous? It's likely to
slow things down considerably, since the filesystem is probably going
to
Hi,
I'm a total newbie to Python so any and all advice is greatly
appreciated.
I'm trying to use regular expressions to process text in an SGML file
but only in one section.
So the input would look like this:
<ch-part no="I"><title>RESEARCH GUIDE</title>
<sec-main no="1.01"><title>content</title>
<para>content</para>
</sec-main>
That's how Python works. You read in the whole file, edit it, and
write it back out. As far as I know there's no way to edit a file
in place which I'm assuming is what you're asking?
And now, cue the responses telling you to use a fancy parser (XML?) for your project ;-)
-Greg
On 4 Oct 2005
You can edit a file in place, but it is not applicable to what you are doing.
As soon as you insert the first biblio, you've shifted everything
downstream by those 8 bytes. Since they map to physically located blocks on
a physical drive, you will have to rewrite those blocks. If it is a big
[EMAIL PROTECTED] writes:
I'm a total newbie to Python so any and all advice is greatly
appreciated.
Well, I've got some for you.
I'm trying to use regular expressions to process text in an SGML file
but only in one section.
This is generally a bad idea. SGML family languages aren't easy to
Gregory Piñero wrote:
That's how Python works. You read in the whole file, edit it, and write it
back out.
that's how file systems work. if file systems generally supported insert
operations, Python would of course support that feature.
/F
--
Even though you are using re's to try to look for specific substrings
(which you sort of fake in by splitting on Identifier, and then
prepending Identifier to every list element, so that the re will
match...), this program has quite a few holes.
What if the word Identifier is inside one of the
Hello pruebauno,
import re
f = file('tlst')
tlst = f.read().split('\n')
f.close()
tlst = open('tlst').readlines()
f = file('plst')
sep = re.compile('Identifier "(.*?)"')
plst = []
for elem in f.read().split('Identifier'):
    content = 'Identifier' + elem
    match = sep.search(content)
    if
Paul McGuire wrote:
match...), this program has quite a few holes.
What if the word Identifier is inside one of the quoted strings?
What if the actual value is tablename10? This will match your
tablename1 string search, but it is certainly not what you want.
Did you know there are trailing
Miki Tebeka wrote:
Look at re.findall, I think it'll be easier.
Minor changes aside the interesting thing, as you pointed out, would be
using re.findall. I could not figure out how to.
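Following the re.findall suggestion, a sketch of the one-call version of the split-then-search loop (the `Identifier "..."` input format is an assumption based on the regex quoted in the thread):

```python
import re

text = 'Identifier "foo" x 1\nIdentifier "bar" y 2\n'
# One findall replaces splitting on "Identifier" and re-searching each piece.
names = re.findall(r'Identifier "(.*?)"', text)
print(names)  # ['foo', 'bar']
```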
[EMAIL PROTECTED] wrote:
Paul McGuire wrote:
match...), this program has quite a few holes.
I tried running it, though, and it is not working for me. The following code
runs but prints nothing at all:
import pyparsing as prs
And this is the point where I have to post the real stuff because your
Yes indeed, the real data often has surprising differences from the
simulations! :)
It turns out that pyparsing LineStart()'s are pretty fussy. Usually,
pyparsing is very forgiving about whitespace between expressions, but
it turns out that LineStart *must* be followed by the next expression,
I am sure there is a better way of writing this, but how?
import re
f = file('tlst')
tlst = f.read().split('\n')
f.close()
f = file('plst')
sep = re.compile('Identifier "(.*?)"')
plst = []
for elem in f.read().split('Identifier'):
    content = 'Identifier' + elem
    match = sep.search(content)
    if
Maurice LING wrote:
Matt wrote:
I'd HIGHLY suggest purchasing the excellent "Mastering Regular
Expressions" (http://www.oreilly.com/catalog/regex2/index.html)
by Jeff Friedl. Although it's mostly
geared
towards Perl, it will answer all your questions about regular
expressions.
1 - 100 of 112 matches