On Monday, October 12, 2015 at 10:02:13 PM UTC+11, Laura Creighton wrote:
> In a message of Sun, 11 Oct 2015 17:56:33 -0700, Victor Hooi writes:
> >Hi,
> >
> >I'm attempting to parse MongoDB loglines.
> >
> >The formatting of these loglines could best be described as JSON-like...
Hi,
I'm attempting to parse MongoDB loglines.
The formatting of these loglines could best be described as JSON-like...
For example - arrays
Anyhow, say I had the following logline snippet:
{ Global: { acquireCount: { r: 2, w: 2 } }, Database: { acquireCount: { w:
2 } }, Collection: {
I'm using Python to parse metrics out of logfiles.
The logfiles are fairly large (multiple GBs), so I'm keen to do this in a
reasonably performant way.
The metrics are being sent to an InfluxDB database - so it's better if I can
batch multiple metrics together, rather than sending them
On Thursday, 3 September 2015 03:49:05 UTC+10, Terry Reedy wrote:
> On 9/2/2015 6:04 AM, Victor Hooi wrote:
> > I'm using grouper() to iterate over a textfile in groups of lines:
> >
> > def grouper(iterable, n, fillvalue=None):
> > "Collect data in
Hi,
I'm using Python to parse out metrics from logfiles, and ship them off to a
database called InfluxDB, using their Python driver
(https://github.com/influxdb/influxdb-python).
With InfluxDB, it's more efficient if you pack in more points into each message.
Hence, I'm using the grouper()
is
then passing another iterable to enumerate - I'm just not sure of the best way
to get the line numbers from the original iterable f, and pass this through the
chain?
On Wednesday, 2 September 2015 20:37:01 UTC+10, Peter Otten wrote:
> Victor Hooi wrote:
>
> > I'm using grouper() to
I have a function which is meant to return a tuple:
def get_metrics(server_status_json, metrics_to_extract, line_number):
return ((timestamp, "serverstatus", values, tags))
I also have:
def create_point(timestamp, metric_name, values, tags):
return {
I'm using grouper() to iterate over a textfile in groups of lines:
from itertools import zip_longest

def grouper(iterable, n, fillvalue=None):
    "Collect data into fixed-length chunks or blocks"
    # grouper('ABCDEFG', 3, 'x') --> ABC DEF Gxx
    args = [iter(iterable)] * n
    return zip_longest(fillvalue=fillvalue, *args)
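One way to keep the original line numbers through grouper() (a sketch - the itertools recipe plus enumerate, applied *before* chunking):

```python
from itertools import zip_longest

def grouper(iterable, n, fillvalue=None):
    "Collect data into fixed-length chunks or blocks"
    args = [iter(iterable)] * n
    return zip_longest(*args, fillvalue=fillvalue)

# Enumerate before grouping: each chunk is then a tuple of
# (line_number, line) pairs, so the numbering from the original
# file survives the chunking.
lines = ['alpha', 'beta', 'gamma', 'delta', 'epsilon']
chunks = list(grouper(enumerate(lines, start=1), 2))
# chunks[0] is ((1, 'alpha'), (2, 'beta'))
```

The same pattern works with a file object in place of the list, since enumerate accepts any iterable.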
consistency in MongoDB
is...quirky, shall we say.
Cheers,
Victor
On Friday, 28 August 2015 16:15:21 UTC+10, Jussi Piitulainen wrote:
Ben Finney writes:
Victor Hooi writes:
[- -]
For example:
{
hostname: example.com,
version: 3.0.5,
pid: {
floatApprox: 18403
I'm reading JSON output from an input file, and extracting values.
Many of the fields are meant to be numerical; however, some fields are wrapped
in a floatApprox dict, which messes up my parsing.
For example:
{
hostname: example.com,
version: 3.0.5,
pid: {
floatApprox:
or strings (although strings would not be wrapped in
floatApprox).
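A sketch of one way to strip the floatApprox wrappers after parsing - a recursive pass that replaces any {'floatApprox': x} dict with the bare value (the sample document below is adapted from the snippet above):

```python
def unwrap_float_approx(value):
    """Recursively replace {'floatApprox': x} wrappers with x."""
    if isinstance(value, dict):
        if set(value) == {'floatApprox'}:
            return value['floatApprox']
        return {k: unwrap_float_approx(v) for k, v in value.items()}
    if isinstance(value, list):
        return [unwrap_float_approx(v) for v in value]
    return value

doc = {'hostname': 'example.com', 'version': '3.0.5',
       'pid': {'floatApprox': 18403}}
clean = unwrap_float_approx(doc)
# clean['pid'] is now the plain number 18403
```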
On Friday, 28 August 2015 14:58:01 UTC+10, Victor Hooi wrote:
I'm reading JSON output from an input file, and extracting values.
Many of the fields are meant to be numerical, however, some fields are
wrapped in a floatApprox
Hi,
I have a textfile with a bunch of JSON objects, one per line.
I'm looking at parsing each of these, and extract some metrics from each line.
I have a dict called metrics_to_extract, containing the metrics I'm looking
at extracting. In this, I store a name used to identify the metric, along
I have a line that looks like this:
14 *0330 *0 760 411|0 0 770g 1544g 117g 1414
computedshopcartdb:103.5% 0 30|0 0|119m97m 1538
ComputedCartRS PRI 09:40:26
I'd like to split this line on multiple separators - in this case,
On Tuesday, 28 July 2015 23:59:11 UTC+10, m wrote:
W dniu 28.07.2015 o 15:55, Victor Hooi pisze:
I know the re module also has a split; unfortunately, it does not
collapse consecutive whitespace:
In [19]: re.split(' |', f)
Try ' *\|'
p. m.
Hmm, that seems to be getting
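Building on the ' *\|' suggestion above, a sketch of a single re.split() pattern that splits on either a pipe (with optional surrounding spaces) or a run of whitespace, so consecutive separators collapse:

```python
import re

line = ('14  *0330  *0  760  411|0  0  770g  1544g  117g  1414 '
        'computedshopcartdb:103.5% 0 30|0 0|119m97m 1538 '
        'ComputedCartRS PRI 09:40:26')

# Alternation: a pipe padded with optional whitespace, OR a whitespace
# run. The quantifiers make repeated separators yield a single split.
fields = re.split(r'\s*\|\s*|\s+', line)
# fields[0] == '14', fields[4] == '411', fields[-1] == '09:40:26'
```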
I just want to run some things past you guys, to make sure I'm doing it right.
I'm using Python to parse disk metrics out of iostat output. The device lines
look like this:
Device: rrqm/s wrqm/s r/s w/s rsec/s wsec/s avgrq-sz
avgqu-sz await svctm %util
sda
Aha, cool, that's a good idea =) - it seems I should spend some time getting to
know generators/iterators.
Also, sorry if this is basic, but once I have the block list itself, what is
the best way to parse each relevant line?
In this case, the first line is a timestamp, the next two lines are
Hi,
I'm trying to parse iostat -xt output using Python. The quirk with iostat is
that the output for each second runs over multiple lines. For example:
06/30/2015 03:09:17 PM
avg-cpu: %user %nice %system %iowait %steal %idle
0.030.000.030.000.00 99.94
Device:
Hi,
Aha, yeah, I can add the connection_id as another field in the inner dict - the
only drawback is that the data is then duplicated. However, I suppose even if
it's not elegant, it does work.
However, that ChainMap does look interesting =). And yes, I am actually using
Python 3.x (mainly
Hi,
I have a dict named connections, with items like the following:
In [18]: connections
Out[18]:
{'3424234': {'end_timestamp': datetime.datetime(2015, 3, 25, 5, 31, 30, 406000,
tzinfo=datetime.timezone(datetime.timedelta(-1, 61200))),
'ip_address': '10.168.8.36:52440',
'open_timestamp':
Hi,
What is the currently most Pythonic way for doing deep comparisons between
dicts?
For example, say you have the following two dictionaries
a = {
'bob': { 'full_name': 'bob jones', 'age': 4, 'hobbies': ['hockey',
'tennis'], 'parents': { 'mother': 'mary', 'father': 'mike'}},
comparisons:
https://pypi.python.org/pypi/datadiff
Cheers,
Victor
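For a plain yes/no answer, == already compares nested dicts deeply. When you also want to know *where* they differ, a minimal recursive diff can be sketched like this (the 'bob' data below is adapted from the example above; the +/-/~ labels are my own convention):

```python
def dict_diff(a, b, path=''):
    """Recursively collect the key paths where two dicts differ."""
    diffs = []
    for key in sorted(set(a) | set(b)):
        here = '%s.%s' % (path, key) if path else str(key)
        if key not in a:
            diffs.append('+ %s' % here)       # only in b
        elif key not in b:
            diffs.append('- %s' % here)       # only in a
        elif isinstance(a[key], dict) and isinstance(b[key], dict):
            diffs.extend(dict_diff(a[key], b[key], here))
        elif a[key] != b[key]:
            diffs.append('~ %s' % here)       # values differ
    return diffs

x = {'bob': {'age': 4, 'hobbies': ['hockey', 'tennis']}}
y = {'bob': {'age': 5, 'hobbies': ['hockey', 'tennis']}}
# dict_diff(x, y) reports ['~ bob.age']
```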
On Friday, 20 March 2015 13:33:52 UTC+11, Ben Finney wrote:
Victor Hooi victorh...@gmail.com writes:
What is the currently most Pythonic way for doing deep comparisons
between dicts?
What distinction do you intend by saying
Hi,
I'm running pep8 across my code, and getting warnings about my long lines (> 80
characters).
I'm wonder what's the recommended way to handle the below cases, and fit under
80 characters.
First example - multiple context handlers:
with open(self.full_path, 'r') as input,
:
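For the multiple-context-handler case, a backslash continuation keeps the with-statement under 80 characters per line (a sketch; the file names here are hypothetical stand-ins for self.full_path and so on):

```python
import os
import tempfile

# Hypothetical stand-ins for the real attributes:
in_path = os.path.join(tempfile.mkdtemp(), 'input.txt')
out_path = os.path.join(tempfile.mkdtemp(), 'output.csv')
with open(in_path, 'w') as f:
    f.write('hello\n')

# The trailing backslash continues the with-statement onto a second,
# indented line, so neither line exceeds 80 characters:
with open(in_path, 'r') as input_file, \
        open(out_path, 'ab') as output_file:
    output_file.write(input_file.read().encode())

with open(out_path) as f:
    content = f.read()
```

(Parentheses around the context managers only became legal in much later Python versions, so the backslash is the usual workaround in this era.)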
In this case, the 80-character mark is actually partway through previously
processed files (the first occurrence)...
Cheers,
Victor
On Thursday, 28 November 2013 12:57:13 UTC+11, Victor Hooi wrote:
Hi,
I'm running pep8 across my code, and getting warnings about my long lines (
80
Hi,
I'm trying to use Python's new style string formatting with a dict and string
together.
For example, I have the following dict and string variable:
my_dict = { 'cat': 'ernie', 'dog': 'spot' }
foo = 'lorem ipsum'
If I want to just use the dict, it all works fine:
'{cat} and
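One way to mix the dict and the standalone string in a single format() call is to unpack the dict with ** and pass the extra value as a keyword (a sketch; the 'phrase' placeholder name is my own):

```python
my_dict = {'cat': 'ernie', 'dog': 'spot'}
foo = 'lorem ipsum'

# ** unpacks the dict into keyword arguments; the extra variable is
# passed explicitly under its own placeholder name.
result = '{cat} and {dog} say {phrase}'.format(phrase=foo, **my_dict)
# result == 'ernie and spot say lorem ipsum'
```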
Hi,
Ok, this is a topic that I've never really understood properly, so I'd like to
find out what's the proper way of doing things.
Say I have a directory structure like this:
furniture/
__init__.py
chair/
__init__.py
config.yaml
in results:
or is there a better way?
Cheers,
Victor
On Tuesday, 19 November 2013 20:36:47 UTC+11, Mark Lawrence wrote:
On 19/11/2013 07:13, Victor Hooi wrote:
So basically, using exception handling for flow-control.
However, is that considered bad practice, or un-Pythonic
Hi,
I have a script that needs to handle input files of different types
(uncompressed, gzipped etc.).
My question is regarding how I should handle the different cases.
My first thought was to use a try-catch block and attempt to open it using the
most common filetype, then if that failed, try
Hi,
I have a general question regarding try-except handling in Python.
Previously, I was putting the try-except blocks quite close to where the errors
occurred:
A somewhat contrived example:
if __name__ == '__main__':
    my_pet = Dog('spot', 5, 'brown')
    my_pet.feed()
Hi,
I have a Python script that's using a format string without positional
specifiers. I.e.:
LOG_FILENAME = 'my_something_{}.log'.format(datetime.now().strftime('%Y-%d-%m_%H.%M.%S'))
I'm running this from within a virtualenv, running under Python 2.7.3.
$ python -V
Python 2.7.3
4.4.6-4)]
$ sudo python sync_bexdb.py
[sudo] password for victor:
2.6.6 (r266:84292, Jul 10 2013, 22:48:45)
[GCC 4.4.7 20120313 (Red Hat 4.4.7-3)]
Cheers,
Victor
On Tuesday, 5 November 2013 10:02:50 UTC+11, Chris Angelico wrote:
On Tue, Nov 5, 2013 at 9:33 AM, Victor Hooi victorh...@gmail.com wrote
Hi,
We have a machine running CentOS 6.4, and we're attempting to compile Python
3.3.2 on it:
# cat /etc/redhat-release
CentOS release 6.4 (Final)
We've compiled openssl 1.0.1e 11 by hand on this box, and installed it into
/usr/local/:
# openssl
OpenSSL
suggested?
Cheers,
victor
On Tuesday, 29 October 2013 10:43:19 UTC+11, Dennis Lee Bieber wrote:
On Sun, 27 Oct 2013 20:43:07 -0700 (PDT), Victor Hooi
victorh...@gmail.com declaimed the following:
Hi,
I'd like to double-check something regarding using try-except for
controlling
structure my code so that I can run the sync_em.py
and sync_pg.py scripts, and they can pull common functions from somewhere?
Cheers,
Victor
On Tuesday, 29 October 2013 12:08:10 UTC+11, Victor Hooi wrote:
Hi,
If I try to use:
from .common.common_foo import setup_foo_logging
I
,
Victor
On Tuesday, 29 October 2013 18:44:47 UTC+11, Peter Otten wrote:
Victor Hooi wrote:
Hi,
Hmm, this post on SO seems to suggest that importing from another sibling
directory in a package isn't actually possible in Python without some ugly
hacks?
http
Hi,
I'm attempting to use urlparse.parse_qs() to parse the following url:
https://www.foo.com/cat/dog-13?utm_source=foo1043cutm_medium=emailutm_campaign=ba^Cn=HC
However, when I attempt to parse it, I get:
{'https://www.foo.com/cat/dog-13?utm_source': ['foo1043c'],
'utm_campaign':
=ba^Cn=HC'
In [40]: urlparse.parse_qs(urlparse.urlparse(url).query)
Out[40]:
{'utm_campaign': ['ba^Cn=HC'],
'utm_medium': ['email'],
'utm_source': ['foo1043c']}
Cheers,
Victor
On Wednesday, 30 October 2013 09:34:15 UTC+11, Victor Hooi wrote:
Hi
Hi,
I have a CSV file that I will repeatedly appending to.
I'm using the following to open the file:
with open(self.full_path, 'r') as input, open(self.output_csv, 'ab') as output:
    fieldnames = (...)
    csv_writer = DictWriter(output, fieldnames)
# Call
Hi,
In theory, it *should* just be our script writing to the output CSV file.
However, I wanted it to be robust - e.g. in case somebody spins up two copies
of this script running concurrently.
I suppose the timing would have to be pretty unlucky to hit a race condition
there, right?
As in,
Hi,
NB - I'm the original poster here -
https://groups.google.com/d/topic/comp.lang.python/WUuRLEXJP4E/discussion -
however, that post seems to have diverted, and I suspect my original question
was poorly worded.
I have several Python scripts that use similar functions.
Currently, these
Hi,
We're on Python 2.6 (RHEL based system...) - I don't believe this exposes
FileNotFoundError =(.
Cheers,
Victor
On Monday, 28 October 2013 17:36:05 UTC+11, Chris Angelico wrote:
On Mon, Oct 28, 2013 at 2:43 PM, Victor Hooi victorh...@gmail.com wrote:
Is it acceptable to use try-except
Hi,
Ok, so I should be using absolute imports, not relative imports.
Hmm, I just tried to use absolute imports, and it can't seem to locate the
modules:
In the file foo_loading/em_load/sync_em.py, I have:
from common.common_bex import setup_foo_logging
When I try to run that script:
the
foo_loading/em_load directory?
I thought I could just refer to the full path, and it'd find it, but evidently
not...hmm.
Cheers,
Victor
On Tuesday, 29 October 2013 12:01:03 UTC+11, Ben Finney wrote:
Victor Hooi victorh...@gmail.com writes:
Ok, so I should be using absolute imports
Hi,
I have a collection of Python scripts I'm using to load various bits of data
into a database.
I'd like to move some of the common functions (e.g. to setup loggers, reading
in configuration etc.) into a common file, and import them from there.
I've created empty __init__.py files, and my
Hi,
I'd like to double-check something regarding using try-except for controlling
flow.
I have a script that needs to lookup things in a SQLite database.
If the SQLite database file doesn't exist, I'd like to create an empty
database, and then setup the schema.
Is it acceptable to use
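For the SQLite case specifically, a try-except on the connect won't fire, because sqlite3.connect() silently creates a missing file. A sketch of a check-then-create approach instead (the pets schema is a made-up placeholder):

```python
import os
import sqlite3
import tempfile

def open_db(path):
    """Open an SQLite database, creating the schema if the file is new.

    sqlite3.connect() creates a missing file rather than raising, so
    we test for the file up front instead of catching an exception.
    """
    needs_schema = not os.path.exists(path)
    conn = sqlite3.connect(path)
    if needs_schema:
        conn.execute('CREATE TABLE pets (name TEXT, age INTEGER)')
        conn.commit()
    return conn

db_path = os.path.join(tempfile.mkdtemp(), 'pets.db')
conn = open_db(db_path)
```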
)
raise
that will just re-raise the original exception right?
Cheers,
Victor
On Thursday, 24 October 2013 15:42:53 UTC+11, Andrew Berg wrote:
On 2013.10.23 22:23, Victor Hooi wrote:
For example:
def run_all(self):
self.logger.debug('Running loading job
Hi,
We have a directory of large CSV files that we'd like to process in Python.
We process each input CSV, then generate a corresponding output CSV file.
input CSV - munging text, lookups etc. - output CSV
My question is, what's the most Pythonic way of handling this? (Which I'm
assuming
Hi,
I have a Python class that represents a loading job.
Each job has a run_all() method that calls a number of other class methods.
I'm calling run_all() on a bunch of jobs.
Some of the methods called by run_all() can raise exceptions (e.g. missing
files, DB connection failures) which I'm
there?
Cheers,
Victor
On Tuesday, 22 October 2013 14:53:58 UTC+11, Ben Finney wrote:
Victor Hooi victorh...@gmail.com writes:
Aha, good point about IOError encapsulating other things, I'll use
FileNotFoundError, and also add in some other except blocks for the
other ones
Hi,
I suspect I'm holding
How should I use the with context handler as well as handling specific
exceptions?
For example, for a file:
with open('somefile.log', 'wb') as f:
    f.write(b'hello there')
How could I specifically catch IOError in the above, and handle that? Should I
wrap
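One option is to wrap the whole with-block in the try: that catches failures from both the open() call and the writes, while the context manager still closes the file on the success path. A sketch (the unwritable path here is deliberately bogus to show the except firing):

```python
import logging

logger = logging.getLogger(__name__)

caught = None
try:
    # The open() itself can fail (missing directory, permissions),
    # as can each write; wrapping the whole block covers both.
    with open('/nonexistent-dir/somefile.log', 'wb') as f:
        f.write(b'hello there')
except IOError as e:  # IOError is an alias of OSError on Python 3
    logger.error('Could not write log file: %s', e)
    caught = e
```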
there - %s % e)
Cheers,
Victor
On Tuesday, 22 October 2013 14:04:14 UTC+11, Ben Finney wrote:
Victor Hooi victorh...@gmail.com writes:
try:
    with open('somefile.log', 'wb') as f:
        f.write(b'hello there')
except IOError as e:
    logger.error('Uhoh
Hi,
I have a Python script where I want to fork and run an external command (or
set of commands).
For example, after doing xyz, I then want to run ssh to a host, handover
control back to the user, and have my script terminate.
Or I might want to run ssh to a host, less a certain textfile,
, what's this improvement you mentioned?
Cheers,
Victor
On Wednesday, 3 July 2013 13:59:19 UTC+10, rusi wrote:
On Wednesday, July 3, 2013 9:17:29 AM UTC+5:30, Victor Hooi wrote:
Hi,
I have a Python script where I want to run fork and run an external command
(or set of commands
Hi,
I'm trying to compile a regex Python with the re.VERBOSE flag (so that I can
add some friendly comments).
The issue is that I normally use constants to define re-usable bits of the
regex - however, these don't get interpreted inside the triple quotes.
For example:
import re
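One workaround is to interpolate the constants into the verbose pattern string before compiling. A sketch using %-formatting (safer than str.format here, since regexes often contain literal braces like {1,3}); the IP/status-code pattern is a made-up example:

```python
import re

# Re-usable building block, interpolated before compiling:
IP_OCTET = r'\d{1,3}'
IP = r'%s\.%s\.%s\.%s' % ((IP_OCTET,) * 4)

pattern = re.compile(r'''
    (?P<ip>%s)       # client IP address (interpolated constant)
    \s+
    (?P<code>\d+)    # status code
''' % IP, re.VERBOSE)

m = pattern.match('10.168.8.36 200')
# m.group('ip') == '10.168.8.36', m.group('code') == '200'
```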
Hi,
I have logline that I need to test against multiple regexes. E.g.:
import re
expression1 = re.compile(r'')
expression2 = re.compile(r'')
with open('log.txt') as f:
    for line in f:
        if expression1.match(line):
            # Do something -
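Instead of a chain of if/elif tests, the regexes can be paired with handlers in a list and tried in order, stopping at the first match per line. A sketch (the ERROR/WARN patterns are made-up placeholders):

```python
import re

# Each entry pairs a compiled regex with a label (or a handler
# function); first match wins.
patterns = [
    (re.compile(r'ERROR (?P<msg>.*)'), 'error'),
    (re.compile(r'WARN (?P<msg>.*)'), 'warning'),
]

def classify(line):
    for regex, label in patterns:
        m = regex.match(line)
        if m:
            return label, m.group('msg')
    return None  # no pattern matched this line
```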
HI,
NB: I've posted this question on Reddit as well (but didn't get many responses
from Pythonistas) - hope it's ok if I post here as well.
We currently use a collection of custom Python scripts to validate various
things in our production environment/configuration.
Many of these are simple
Hi,
I have a Python script that I'd like to spawn a separate process (SSH client,
in this case), and then have the script exit whilst the process continues to
run.
I looked at Subprocess, however, that leaves the script running, and it's more
for spawning processes and then dealing with their
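On POSIX, subprocess can still do this if you detach the child into its own session and never wait on it - a sketch of "spawn and walk away", not a full double-fork daemonisation (the child command here is a trivial placeholder rather than a real SSH client):

```python
import subprocess
import sys

# start_new_session=True puts the child in its own session, so it is
# not tied to this script's controlling terminal and keeps running
# after the script exits.
proc = subprocess.Popen(
    [sys.executable, '-c', 'print("child running")'],
    stdout=subprocess.DEVNULL,
    stderr=subprocess.DEVNULL,
    start_new_session=True,
)
# The parent can now exit without calling proc.wait().
```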
Hi,
I'm trying to compare two logfiles in Python.
One logfile will have lines recording the message being sent:
05:00:06 Message sent - Value A: 5.6, Value B: 6.2, Value C: 9.9
the other logfile has lines recording the message being received:
05:00:09 Message received - Value A: 5.6,
, 8 January 2013 09:58:36 UTC+11, Oscar Benjamin wrote:
On 7 January 2013 22:10, Victor Hooi victorh...@gmail.com wrote:
Hi,
I'm trying to compare two logfiles in Python.
One logfile will have lines recording the message being sent:
05:00:06 Message sent - Value
Hi,
I'm using pysvn to checkout a specific revision based on date - pysvn will only
accept a date in terms of seconds since the epoch.
I'm attempting to use time.mktime() to convert a date (e.g. 2012-02-01) to
seconds since epoch.
According to the docs, mktime expects a 9-element tuple.
My
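A sketch of the date-to-epoch conversion: strptime() builds the 9-element struct_time that mktime() expects, so you don't have to assemble the tuple by hand.

```python
import time

def date_to_epoch(date_string):
    """Convert a 'YYYY-MM-DD' date to seconds since the epoch.

    time.strptime() returns the 9-element struct_time mktime() wants;
    fields not present in the string default to 0 (with tm_isdst left
    for the C library to resolve). mktime() interprets local time.
    """
    return time.mktime(time.strptime(date_string, '%Y-%m-%d'))

seconds = date_to_epoch('2012-02-01')
```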
Hi,
I have a script that trawls through log files looking for certain error
conditions. These are identified via certain keywords (all different) in those
lines
I then process those lines using regex groups to extract certain fields.
Currently, I'm using a for loop to iterate through the
, 12 Dec 2012 14:35:41 -0800, Victor Hooi wrote:
Hi,
I have a script that trawls through log files looking for certain error
conditions. These are identified via certain keywords (all different) in
those lines
I then process those lines using regex groups to extract
to do it that way.
However, I'd still like to fix up the regex, or fix any glaring issues with it
as well.
Cheers,
Victor
On Thursday, 13 December 2012 17:19:57 UTC+11, Chris Angelico wrote:
On Thu, Dec 13, 2012 at 5:10 PM, Victor Hooi victorh...@gmail.com wrote:
Are there any other general
Hi,
I have a directory tree with various XML configuration files.
I then have separate classes for each type, which all inherit from a base. E.g.
class AnimalConfigurationParser:
...
class DogConfigurationParser(AnimalConfigurationParser):
...
class
Hi,
I'm getting a strange error when I try to run the following:
for root, dirs, files in os.walk('./'):
    for file in files:
        if file.startswith('ml') and file.endswith('.xml') and 'entity' not in file:
            print(root)
            print(file)
Hi,
Ignore me - PEBKAC...lol.
I used root both for the os.walk, and also for the root XML element.
Thanks anyhow =).
Cheers,
Victor
On Monday, 10 December 2012 11:52:34 UTC+11, Victor Hooi wrote:
Hi,
I'm getting a strange error when I try to run the following:
for root
heya,
Dave: Ahah, thanks =).
You're right, my terminology was off, I want to dynamically *instantiate*, not
create new classes.
And yes, removing the brackets worked =).
Cheers,
Victor
On Monday, 10 December 2012 11:53:30 UTC+11, Dave Angel wrote:
On 12/09/2012 07:35 PM, Victor Hooi wrote
Victor Hooi added the comment:
Hi,
I didn't have any buffering size set before, so I believe it defaults to 0 (no
buffering), right? Wouldn't this be the behaviour on both 2.x and 3.x?
I'm using a 1.5 Mb bzip2 file - I just tried setting buffering to 1000 and
100, and it didn't seem
Victor Hooi added the comment:
Hi,
Aha, whoops, sorry Serhiy, didn't see your second message - I think you and I
posted our last messages at nearly the same time...
Cool, looking forward to your patch =).
Also, is there any chance you could provide a more detailed explanation of
what's
New submission from Victor Hooi:
Hi,
I was writing a script to parse BZ2 logfiles under Python 2.6, and I noticed
that bz2file (http://pypi.python.org/pypi/bz2file) seemed to perform much
slower than with bz2 (native):
http://stackoverflow.com/questions/12575930/is-python-bz2file-slower
Hi,
I have a simple Python script to perform operations on various types on
in-house servers:
manage_servers.py operation type_of_server
Operations are things like check, build, deploy, configure, verify etc.
Types of server are just different types of inhouse servers we use.
We have a
Hi,
I'm attempting to use argparse to write a simple script to perform operations
on various types of servers:
manage_servers.py operation type_of_server
Operations are things like check, build, deploy, configure, verify etc.
Types of server are just different types of inhouse servers we
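A sketch of what that argparse interface might look like: two positional arguments, each restricted to a known set of choices (the operations come from the post above; the server types here are hypothetical placeholders):

```python
import argparse

parser = argparse.ArgumentParser(description='Manage in-house servers')
parser.add_argument(
    'operation',
    choices=['check', 'build', 'deploy', 'configure', 'verify'],
    help='what to do to the servers')
parser.add_argument(
    'type_of_server',
    choices=['web', 'db', 'cache'],  # hypothetical server types
    help='which class of in-house server to target')

# parse_args() takes an explicit list here for demonstration;
# with no argument it reads sys.argv.
args = parser.parse_args(['deploy', 'web'])
```

Invalid choices get a usage message and a non-zero exit for free, which covers most of the validation such a script needs.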