Re: Question about PANDAS

2014-10-20 Thread Johann Hibschman
giacomo boffi  writes:

>  2. choose ONE flavour of python, either 2.7.x or 3.4.x
> - future is with 3.4,
> - most examples you'll find were written (are still written...)
>   for 2.7.x

If you're interested in statistics (as comparisons to R suggest), I'd
recommend anaconda.  It comes with pandas built-in, for one.  I'd also
suggest the 3.4 version.  Finally, just over the past few months, we've
crossed over to where 3.4 is fully-functional in the anaconda
distribution.  (For a while statsmodels was the hold-out; matplotlib had
problems before that.  Now, though, all is good.)

Cheers,
Johann
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: I'm looking to start a team of developers, quants, and financial experts, to setup and manage an auto-trading-money-making-machine

2014-10-14 Thread Johann Hibschman
Marko Rauhamaa  writes:

> ryguy7272 :
>
>> I'm looking to start a team of developers, quants, and financial
>> experts, to setup and manage an auto-trading-money-making-machine
>
> This has already been done: http://en.wikipedia.org/wiki/Sampo

And mocked by MST3K ("sampo means flavor!"):

  https://www.youtube.com/watch?v=cdfUkrbNvwA

-Johann (whose cousins are all Mattinens and Nikkanens)


Re: GCD in Fractions

2014-09-24 Thread Johann Hibschman
Steven D'Aprano  writes:

> blindanagram wrote:
>
>> Secondly (as others here have pointed out), the mathematical properties
>> of the greatest common divisor are well defined for both positive and
>> negative integers.
>
> You keep saying that, but it simply is not true. Different people use
> different definitions. Some refuse to allow negative arguments at all. Some
> insist that the GCD must be positive. Others allow it to be negative.

I can't find a good source for allowing it to be negative, though.
Clearly, the primary use of the function is on the positive integers,
with the negatives being an extension.

> Mathworld does show one thing that suggests an interpretation for the GCD of
> negative values:
>
>  The GCD is distributive
> GCD(ma,mb)=mGCD(a,b) 
>
> which tells us that:
>
> GCD(-x, -y) = -GCD(x, y)
>
> And yet, Mathematica has:
>
> GCD(-x, -y) = GCD(x, y)
>
> the very opposite of what Mathworld says, despite coming from the same
> people.

This is most likely simply them dropping the constraint that m must be
non-negative.  Wikipedia, for example, specifies it under "Properties."

> The Collins Dictionary of Mathematics (second edition, 2002) says:
>
> highest common factor, greatest common factor, or greatest 
> common divisor (abbrev hcf, gcf, gcd)
>
> n, an integer d that exactly divides (sense 2) two given 
> integers a and b, and is such that if c divides a and b, 
> then c divides d; this definition extends to finite sets 
> of integers and to integral domains. For example, the 
> highest common factor of 12, 60 and 84 is 12.
>
> Yet again, we have no clear definition for negative values.

As pointed out, this definition always yields two values (positive and
negative), even for positive a and b, so there's nothing special for
negative a or b.  Typically, I've seen this augmented with "choose the
positive one" to get a single value.

> Here's an example using Euclid's algorithm to calculate the GCD of negative
> numbers, and sure enough, you get a negative result:

The algorithm is pretty irrelevant here.  gcd's not defined by a
particular algorithm to calculate it.

From everything that I've seen, mathematicians consider the gcd to be
always positive.

Now, that's not saying that the fractions module should implement the
mathematical gcd, if it doesn't need it.  That should be its own argument,
though;
it doesn't help to add false doubt about what the gcd of negative
numbers should be.
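
To make the distinction concrete, here is a minimal sketch (the names
`euclid` and `gcd_nonneg` are mine, purely for illustration).  With
Python's floored `%`, a bare Euclidean loop can hand back a negative
value, while taking the absolute value at the end gives the conventional
non-negative gcd.  As far as I know, `math.gcd` in current Python (3.5
and later) also returns a non-negative result.

```python
import math


def euclid(a, b):
    """Bare Euclidean loop; the sign of the result depends on the inputs."""
    while b:
        a, b = b, a % b
    return a


def gcd_nonneg(a, b):
    """Same algorithm, normalized to the usual 'choose the positive one' rule."""
    return abs(euclid(a, b))


raw = euclid(-12, -18)            # negative with this particular loop
normalized = gcd_nonneg(-12, -18)
builtin = math.gcd(-12, -18)
```

The algorithm and the definition are separate things: the loop is the
same in both functions, and only the final normalization decides the
sign.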

Cheers,
Johann


Re: Why Python 4.0 won't be like Python 3.0

2014-08-19 Thread Johann Hibschman
Skip Montanaro  writes:

> On Tue, Aug 19, 2014 at 9:27 AM, Grant Edwards  
> wrote:
>> I'm probably conflating the 1.5.2/2.0 and the 2.6 stuff.  I do
>> remember delaying moving from 1.5.2 -> 2.0 until I really had to, but
>> I don't remember why.
>
> If you were a RedHat user during that timeframe, that might have
> contributed to your decision to delay. I no longer remember the
> details, but it was rather painful.

I vaguely remember holding off for a while until SWIG had 2.0 support,
or maybe Numeric lagged, or something, but that's getting pretty fuzzy.
There was definitely more there than, say, for 1.4 to 1.5.  It's hard to
believe that the Dubois/Hinsen/Hugunin article in Computers in Physics
(which was where I got my start with python) was a full 18 years ago.

Cheers,
Johann


Re: Python in financial services

2014-08-12 Thread Johann Hibschman
Rustom Mody  writes:

> I've been asked to formulate a python course for financial services
> folk.
>
> If I actually knew about the subject, I'd have fatter pockets!
> Anyway, here are some thoughts. What am I missing?

Good luck!  It's a pretty broad field, so everyone probably has
different needs.

> - Libraries -- Decimal?

I've never seen decimal used, even though it makes sense for
accounting-style finance.  I've mostly been looking at forecasts,
trading, and risk, where floats are fine.  So maybe mention that it
exists, so people know where to look if they need it, but don't stress
it.
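
If it helps to show why decimal exists at all, a small sketch along
these lines would probably do (the amounts are made up):

```python
from decimal import Decimal

# Binary floats cannot represent 0.1, 0.2, or 0.3 exactly.
float_sum_matches = (0.1 + 0.2 == 0.3)

# Decimal arithmetic on values built from strings is exact here.
decimal_sum_matches = (Decimal("0.10") + Decimal("0.20") == Decimal("0.30"))

# Summing three invoices of 19.99 stays exact to the cent.
total = sum([Decimal("19.99")] * 3, Decimal("0"))
```

Building the Decimal values from strings rather than floats matters,
since Decimal(0.1) would carry over the float's rounding error.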

> - scripts -- philosophy and infrastructure eg argparse, os.path

Basic argparse is very handy, but, again, I wouldn't spend too much time
on it.
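
Something like the following is about as much as I'd cover.  It's only
a sketch: the script's purpose, the `path` argument, and the `--limit`
and `--verbose` options are hypothetical.

```python
import argparse


def build_parser():
    """Build a parser for a hypothetical 'summarize a trades file' script."""
    parser = argparse.ArgumentParser(description="Summarize a trades file.")
    parser.add_argument("path", help="input CSV file")
    parser.add_argument("--limit", type=int, default=10,
                        help="number of rows to show")
    parser.add_argument("--verbose", action="store_true",
                        help="print extra detail")
    return parser


# Passing an explicit list makes the parser easy to try out interactively;
# in a real script you would call parse_args() with no arguments.
args = build_parser().parse_args(["trades.csv", "--limit", "5"])
```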

> - Pandas
> - Numpy Scipy (which? how much?)

For me, pandas is huge, numpy is a nice fundamental substrate, while
only bits and pieces of scipy are used (mostly optimization).
statsmodels may also be worth a mention, as the answer to "how do I do a
regression".

> - ipython + matplotlib + ??

IPython notebook + matplotlib is great.  At least show that it exists.
pandas plots may be enough, though.

> - Database interfacing

Definitely mention.

> - Excel interfacing (couple of libraries.. which?)

Meh, maybe.  At least give a strategy.  It always seems like a fool's
errand, though: I end up just dumping data to CSV and using that.
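
The CSV route needs nothing beyond the standard library.  A minimal
sketch, with made-up tickers and prices, writing to an in-memory buffer
so it can be run anywhere:

```python
import csv
import io

# Hypothetical data; in practice this would come from a query or a DataFrame.
rows = [("ABC", 10.5), ("XYZ", 20.25)]

buf = io.StringIO()
writer = csv.writer(buf, lineterminator="\n")
writer.writerow(["ticker", "price"])   # header row
writer.writerows(rows)

csv_text = buf.getvalue()
```

When writing to a real file instead, open it with newline="" as the csv
module documentation recommends, and Excel will open the result
directly.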

> - C(C++?) interfacing paradigms -- ranging from ctypes, cython to
>   classic low-level

Probably not, but it depends on the audience.  The overview, like
"ctypes will link to C-like libraries, cython lets you write python-like
code that runs fast, and there's SWIG and Boost.Python if you want to
write your own modules" is about all you need.

Hope that helps,
Johann


Re: NaN comparisons - Call For Anecdotes

2014-07-17 Thread Johann Hibschman
Chris Angelico  writes:

> But you also don't know that he hasn't. NaN doesn't mean "unknown", it
> means "Not a Number". You need a more sophisticated system that allows
> for uncertainty in your data.

Regardless of whether this is the right design, it's still an example of
use.

As to the design, using NaN to implement NA is a hack with a long
history, see

  http://www.numpy.org/NA-overview.html

for some color.  Using NaN gets us a hardware-accelerated implementation
with just about the right semantics.  In a real example, these lists are
numpy arrays with tens of millions of elements, so this isn't a trivial
benefit.  (Technically, that's what's in the database; a given analysis
may look at a sample of 100k or so.)

> You have a special business case here (the need to
> record information with a "maybe" state), and you need to cope with
> it, which means dedicated logic and planning and design and code.

Yes, in principle.  In practice, everyone is used to the semantics of
R-style missing data, which are reasonably well-matched by nan.  In
principle, (NA == 1.0) should be an NA (missing) truth value, as should
(NA == NA), but in practice having it be False is more useful.  As an
example, indexing R vectors by a boolean vector containing NA yields NA
results, which is a feature that I never want.
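
As a rough illustration of those semantics in plain Python (no numpy
here, and the sample values are made up): every ordered or equality
comparison involving NaN comes out False, so a NaN entry simply drops
out of a filter instead of propagating a missing value.

```python
import math

nan = float("nan")
values = [0.5, nan, 2.0]   # made-up sample; nan marks a missing entry

eq_one = (nan == 1.0)      # comparison with a number
eq_self = (nan == nan)     # comparison with itself

# The NaN entry fails the test and is silently excluded.
above_one = [v for v in values if v > 1.0]

# Missing entries still have to be detected explicitly.
missing = [math.isnan(v) for v in values]
```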

Cheers,
Johann


Re: NaN comparisons - Call For Anecdotes

2014-07-17 Thread Johann Hibschman
"Anders J. Munch" <2...@jmunch.dk> writes:
> So far I received exactly the answer I was expecting.  0 examples of
> NaN!=NaN being beneficial.
> I wasn't asking for help, I was making a point.  Whether that will
> lead to improvement of Python, well, I'm not too optimistic, but I
> feel the point was worth making regardless.

Well, I just spotted this thread.  An easy example is, well, pretty much
any case where SQL NULL would be useful.  Say I have lists of borrowers,
the amount owed, and the amount they paid so far.

nan = float("nan")
borrowers = ["Alice", "Bob", "Clem", "Dan"]
amount_owed = [100.0, nan, 200.0, 300.0]
amount_paid = [100.0, nan, nan, 200.0]
who_paid_off = [b for (b, ao, ap) in
  zip(borrowers, amount_owed, amount_paid)
  if ao == ap]

I want to just get Alice from that list, not Bob.  I don't know how much
Bob owes or how much he's paid, so I certainly don't know that he's paid
off his loan.
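
For comparison, here is the same filter wrapped in a function (the
helper name `who_paid_off` and the `missing` parameter are mine, just
for illustration) so it can be run with NaN and with None as the
missing marker.  With None, the missing values compare equal to each
other and Bob slips through.

```python
def who_paid_off(missing):
    """Return borrowers whose amount owed equals the amount paid."""
    borrowers = ["Alice", "Bob", "Clem", "Dan"]
    amount_owed = [100.0, missing, 200.0, 300.0]
    amount_paid = [100.0, missing, missing, 200.0]
    return [b for (b, ao, ap) in zip(borrowers, amount_owed, amount_paid)
            if ao == ap]


with_nan = who_paid_off(float("nan"))   # NaN != NaN, so Bob is excluded
with_none = who_paid_off(None)          # None == None, so Bob is included
```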

Cheers,
Johann


Re: grimace: a fluent regular expression generator in Python

2013-07-17 Thread Johann Hibschman
Ben Last  writes:

> Good points. I wanted to find a syntax that allows comments as well as
> being fluent:
> RE()
> .any_number_of.digits # Recall that any_number_of includes zero
> .followed_by.an_optional.dot.then.at_least_one.digit # The dot is specifically optional
> # but we must have one digit as a minimum
> .as_string()

Speaking of syntax, have you looked at pyparsing?  I like their
pattern-matching syntax, and I can see it being applied to regexes.

They use an operator-heavy syntax, like:

'(' + digits * 3 + ')-' + digits * 3 + '-' + digits * 4

That seems easier for me to read than the foo.then.follow syntax.
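
To sketch how that style might carry over to regexes: the `Pat` class
below is entirely hypothetical (it is not pyparsing's API, nor
grimace's), and only shows that `+` and `*` can be overloaded to build
a pattern string.

```python
import re


class Pat:
    """Hypothetical wrapper around a regex fragment."""

    def __init__(self, regex):
        self.regex = regex

    def __add__(self, other):       # Pat + Pat, or Pat + "literal"
        return Pat(self.regex + _as_regex(other))

    def __radd__(self, other):      # "literal" + Pat
        return Pat(_as_regex(other) + self.regex)

    def __mul__(self, n):           # Pat * 3 means exactly three repeats
        return Pat("(?:%s){%d}" % (self.regex, n))


def _as_regex(x):
    """Plain strings are treated as literals and escaped."""
    return x.regex if isinstance(x, Pat) else re.escape(x)


digit = Pat(r"\d")
phone = "(" + digit * 3 + ")-" + digit * 3 + "-" + digit * 4
matches = re.fullmatch(phone.regex, "(555)-123-4567") is not None
```

Since `*` binds tighter than `+` in Python, the repeats are applied
before the concatenation, which is what makes the expression read
naturally.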

That then makes me think of ometa, which is a fun read, but probably not
completely relevant.

Regards,
Johann


Re: Best Scripting Language for Embedded Work?

2013-07-10 Thread Johann Hibschman
David T. Ashley  writes:

> We develop embedded software for 32-bit micros using Windows as the
> development platform.
...
> I know that Tcl/Tk would do all of the above, but what about Python?
> Any other alternatives?

Given that list, I'd say just use Tcl and be done.  You could force the
square peg of python into that round hole, but I doubt it'd be worth the
effort.

Cheers,
Johann


Re: Encoding NaN in JSON

2013-04-17 Thread Johann Hibschman
Miki Tebeka  writes:

>>> I'm trying to find a way to have json emit float('NaN') as 'N/A'.
>> No.  There is no way to represent NaN in JSON.  It's simply not part of the
>> specification.
> I know that. I'm trying to emit the *string* 'N/A' for every NaN.

Easiest way is probably to transform your object before you try to write
it, e.g.

  def transform(x):
      if isinstance(x, dict):
          return dict((k, transform(v)) for k, v in x.items())
      elif isinstance(x, list) or isinstance(x, tuple):
          return [transform(v) for v in x]
      elif isinstance(x, float) and x != x:
          return 'N/A'
      else:
          return x

Then just use

  json.dumps(transform(x))

rather than just

  json.dumps(x)
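
A self-contained version to try out, with a made-up record.  The
`x != x` test is true only for NaN.  For reference, plain
json.dumps(float('nan')) emits the bare token NaN by default, and
passing allow_nan=False makes it raise ValueError instead.

```python
import json


def transform(x):
    """Recursively replace float NaN values with the string 'N/A'."""
    if isinstance(x, dict):
        return dict((k, transform(v)) for k, v in x.items())
    elif isinstance(x, (list, tuple)):
        return [transform(v) for v in x]
    elif isinstance(x, float) and x != x:   # only NaN is unequal to itself
        return 'N/A'
    else:
        return x


# Hypothetical record with NaN at the top level and inside a list.
record = {"name": "Bob", "rate": float("nan"), "history": [1.5, float("nan")]}

encoded = json.dumps(transform(record))
default_nan = json.dumps(float("nan"))   # not valid JSON, but the default
```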


Re: allow line break at operators

2011-08-15 Thread Johann Hibschman
Chris Angelico  writes:

> Why is left-to-right inherently more logical than
> multiplication-before-addition? Why is it more logical than
> right-to-left? And why is changing people's expectations more logical
> than fulfilling them? Python uses the + and - symbols to mean addition
> and subtraction for good reason. Let's not alienate the mathematical
> mind by violating this rule. It would be far safer to go the other way
> and demand parentheses on everything.

I'm clearly a fool for allowing myself to be drawn into this thread,
but I've been playing a lot recently with the APL-derivative language J,
which uses a right-to-left operator precedence rule.

Pragmatically, this is because J defines roughly a bajillion operators,
and it would be impossible to remember the precedence of them all, but
it makes sense in its own way.

If you read "3 * 10 + 7", using right-to-left, you get "three times
something".  Then you read more and you get "three times (ten plus
something)."  And finally, you get "3*(10+7)".  The prefix gives the
continuation for the rest of the calculation; no matter what you
substitute for X in "3*X", you will always just evaluate X, then multiply
it by 3.  Likewise, for "3*10+X", no matter what X is, you know you'll
add 10 and multiply by 3.

This took me a while to get used to, but it's definitely a nice
property.  Not much to do with python, but I do like the syntax enough
that I've implemented my own toy evaluator for J-like expressions in
python, to get around the verbosity of some bits of numpy.
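
This is not that evaluator, just a minimal sketch of the right-to-left
rule itself, limited to non-negative integers and three operators (the
names `eval_rtl` and `_eval` are made up for the example):

```python
import operator
import re

OPS = {"+": operator.add, "-": operator.sub, "*": operator.mul}


def eval_rtl(expr):
    """Evaluate 'a op b op c ...' with every operator grouping to the right."""
    tokens = re.findall(r"\d+|[-+*]", expr)
    return _eval(tokens)


def _eval(tokens):
    head = int(tokens[0])
    if len(tokens) == 1:
        return head
    # "head op X": evaluate everything to the right first, then apply op.
    return OPS[tokens[1]](head, _eval(tokens[2:]))


right_to_left = eval_rtl("3 * 10 + 7")   # 3 * (10 + 7)
usual = 3 * 10 + 7                       # Python's own precedence
```

The recursion mirrors the reading described above: the prefix "3 *"
fixes what will happen to whatever the rest of the line evaluates to.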

Regards,
Johann


Re: I am fed up with Python GUI toolkits...

2011-07-20 Thread Johann Hibschman
Thomas Jollans  writes:

> On 20/07/11 04:12, sturlamolden wrote:
>> 3. Unpythonic memory management: Python references to deleted C++
>> objects (PyQt). Manual dialog destruction (wxPython). Parent-child
>> ownership might be smart in C++, but in Python we have a garbage
>> collector.
>
> I wonder - what do you think of GTK+?
> I've only used Qt with C++, and I've always been highly suspicious of wx
> (something about the API, or the documentation… I haven't had a look at
> it in a long time), but I always found PyGTK quite nice.

GTK+ doesn't work well at all on Mac, so if "cross-platform" includes
Macs, it's not a contender.

To quote the gtk-osx.sourceforge.net page:

   Developers considering GTK+ as a cross-platform environment for new
   work are advised to evaluate other toolkits carefully before
   committing to GTK if they consider OSX an important market.

From experience, GTK apps are pretty awful on OSX.

-Johann