Re: [Tutor] sorting objects on two attributes

2008-03-03 Thread Eric Abrahamsen

On Mar 4, 2008, at 11:04 AM, Kent Johnson wrote:

> Eric Abrahamsen wrote:
>> Itertools.groupby is totally impenetrable to me
>
> Maybe this will help:
> http://personalpages.tds.net/~kent37/blog/arch_m1_2005_12.html#e69
>
> Kent
>

It did! Thanks very much. I think I understand now what's going on in  
the groupby line. And then this line:

t.append((sorted_part[-1].submit_date, key, sorted_part))

is basically a Decorate-Sort-Undecorate operation, sorting on  
sorted_part[-1].submit_date because that's guaranteed to be the latest  
date in each group?

It's starting to come clear...
___
Tutor maillist  -  Tutor@python.org
http://mail.python.org/mailman/listinfo/tutor


Re: [Tutor] sorting objects on two attributes

2008-03-03 Thread Kent Johnson
Eric Abrahamsen wrote:
> Itertools.groupby is totally impenetrable to me

Maybe this will help:
http://personalpages.tds.net/~kent37/blog/arch_m1_2005_12.html#e69

Kent



Re: [Tutor] sorting objects on two attributes

2008-03-03 Thread Eric Abrahamsen
Well I expected to learn a thing or two, but I didn't expect not to  
understand the suggestions at all! :) Thanks to everyone who  
responded, and sorry for the typo (it was meant to be object_id  
throughout, not content_type).

So far Michael's solution works and is most comprehensible to me – ie  
I stand a fighting chance of figuring it out.

I didn't realize you could make direct use of SortedDicts in django,  
so that's worth a try, too.

Itertools.groupby is totally impenetrable to me, but works as well! Is  
there any consensus about which might go faster? I will go right now  
and google until my brain is full.

Thanks again,

Eric





Re: [Tutor] Python and displaying mathematical equations?

2008-03-03 Thread John Fouhy
On 04/03/2008, Shuai Jiang (Runiteking1) <[EMAIL PROTECTED]> wrote:
> Hello, I'm trying to create an application that retrieves and displays
> (probably in HTML or PDF format) math problems from a database.
> The problem is that I need some sort of mechanism to display mathematical
> equations.

If you can describe your equations in MathML then there may be options
for you -- a quick google for "python mathml" turned up a few hits --
e.g. http://sourceforge.net/projects/pymathml/ or
http://www.grigoriev.ru/svgmath/ (if you accept SVG as an output).

-- 
John.


[Tutor] Python and displaying mathematical equations?

2008-03-03 Thread Shuai Jiang (Runiteking1)
Hello, I'm trying to create an application that retrieves and displays
(probably in HTML or PDF format) math problems from a database.
The problem is that I need some sort of mechanism to display mathematical
equations.

I'm trying to not use Latex as it would cause the users to install Latex
(and most likely dvipng) on their own computers.

Is there a lighter version of Latex for Python or do I have to just use
Latex?

Thanks!

Marshall Jiang

-- 
Visit my blog at runiteking1.blogspot.com


Re: [Tutor] [tutor] Finding image statistics

2008-03-03 Thread Kent Johnson
Varsha Purohit wrote:
> Yeahh so by doing this i am counting only the difference part since we 
> have grayscaled the image and assuming it will count only the pixels 
> that evolve as difference 

Yes

> if i use sum2 instead of sum i think  it 
> will give squared sum which is area... and if i just use count it would 
> count the number of pixels developed like that... sounds interesting .. 
> thanks for throwing light for me in right direction

No. First, pixels are already a measure of area. Second, if each pixel 
value is 0 or 1, squaring the values won't make any difference.
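The difference between a sum and a count can be checked without any imaging library. The following is only a sketch in plain Python: the pixel tuples come from the example quoted in this thread, while `diff_values` is a made-up list of grayscale difference values. As far as I know, `ImageStat.Stat` also exposes `sum2` and `count`, where `count` is the total number of pixels per band rather than the number of non-zero ones.

```python
# Two RGB pixels, as in the example from the thread.
pixels = [(1, 2, 3), (4, 5, 6)]

# Per-band sums: this is what a "sum" statistic reports.
band_sums = tuple(sum(band) for band in zip(*pixels))


def clip(x):
    """Map any non-zero difference to 1 so that summing counts pixels."""
    return 1 if x >= 1 else 0


# Hypothetical grayscale difference values (0 means "no difference").
diff_values = [0, 0, 17, 0, 255, 3, 0]
clipped = [clip(v) for v in diff_values]

changed = sum(clipped)                         # number of differing pixels
sum_of_squares = sum(v * v for v in clipped)   # same, since 0*0 == 0 and 1*1 == 1
total_pixels = len(clipped)                    # every pixel, changed or not
```

Once the values are clipped to 0 or 1, the plain sum already is the pixel count, and squaring changes nothing.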

Kent


Re: [Tutor] [tutor] Finding image statistics

2008-03-03 Thread Varsha Purohit
Yeahh so by doing this i am counting only the difference part since we have
grayscaled the image and assuming it will count only the pixels that evolve
as difference if i use sum2 instead of sum i think  it will give squared
sum which is area... and if i just use count it would count the number of
pixels developed like that... sounds interesting .. thanks for throwing
light for me in right direction

On Sun, Mar 2, 2008 at 4:45 PM, Kent Johnson <[EMAIL PROTECTED]> wrote:

> Varsha Purohit wrote:
> > I am getting this list as an output
> >
> > [268541.0, 264014.0, 324155.0]
>
> This is the sum of the values of the red pixels, etc. They are not
> counts, but sums. So if you have an image with the two pixels
> (1, 2, 3), (4, 5, 6) you would get a sum of (5, 7, 9).
> >
> > Actually in my code i am finding difference between two images and i
> > need to count the number of pixels which appear as a difference of these
> > two images. These values i m getting as output are little large.
>
> So you want a count of the number of pixels that differ between the two
> images? Maybe this:
>
> diff = ImageChops.difference(file1, file2)
>
> # Convert to grayscale so differences are counted just once per pixel
> diff = ImageOps.grayscale(diff)
>
> # Convert each difference to 0 or 1 so we can count them
> # Clipping function
> def clip(x):
>   return 1 if x >= 1 else 0
>
> # Apply the clipping function
> diff = Image.eval(diff, clip)
>
> print ImageStat.Stat(diff).sum
>
> Kent
>
> >
> > i m pasting my code again
> >
> > file1=Image.open("./pics/original.jpg")
> > file2=Image.open(val)
> > diff = ImageChops.subtract(file1,file2,0.3)
> > stat1 = ImageStat.Stat(diff)
> >
> > count1=stat1.sum
> > print count1
> > diff.save("./pics/diff"+".jpg")
> >
> > diff.show()
> >
> > AS you can see i am finding difference of two images which are nearly
> > identical. But they have some pixels that are different and i can see
> > that in the output image diff.jpg. So i am trying to put this difference
> > image in the imagestat and trying to see if i can get to count the
> > number of pixels ...
>
>


-- 
Varsha Purohit,
Graduate Student


[Tutor] sorting objects on two attributes

2008-03-03 Thread Michael H. Goldwasser


Hello Eric,

  Your basic outlook is fine, but you can do it much more efficiently
  with a single sort.   Here's the way I'd approach the task (untested):

  # --
  # first compute the latest date for each id group; uses O(n) time
  newest = {}
  for q in queryset:
      id, date = q.object_id, q.submit_date
      if id not in newest or date > newest[id]:
          newest[id] = date

  # materialize the queryset as a list so it can be sorted in place
  data = list(queryset)

  # now sort based on the following decorator as a key
  data.sort(reverse=True,
            key=lambda x: (newest[x.object_id], x.object_id, x.submit_date))
  # --

  In essence, you compute the max date within each group, but your
  approach (i.e., building explicit sublists and then repeatedly
  calling max on those sublists) is far more time-consuming than the
  above dictionary based approach.

  Note well that using a tuple as a decorator key is more efficient
  than calling sort separately for each subgroup.   The
  lexicographical order of the following

(newest[x.object_id], x.object_id, x.submit_date)

  should produce the order that you desire. The reason for the middle
  entry is to ensure that items stay grouped by object_id in the case that
  two different groups achieve the same maximum date.  It wasn't clear
  to me whether you wanted elements within groups from oldest to
  newest or newest to oldest. I believe that the code I give produces
  the ordering that you intend (newest first within each group, because
  of reverse=True), but you may adjust the sign of the decorated
  elements if necessary.
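  A runnable sketch of the single-sort idea, written for Python 3. The
  Comment namedtuple and the sample dates are stand-ins for the Django
  comment objects, not anything from the real project. Since dates
  cannot simply be negated, the second function shows one possible way
  to get oldest-first order inside each group: two stable sorts instead
  of a sign change.

```python
from collections import namedtuple
from datetime import date

# Stand-in for the comment objects; only the two attributes matter here.
Comment = namedtuple("Comment", "object_id submit_date")


def latest_by_id(comments):
    """Return {object_id: newest submit_date} in a single O(n) pass."""
    newest = {}
    for c in comments:
        if c.object_id not in newest or c.submit_date > newest[c.object_id]:
            newest[c.object_id] = c.submit_date
    return newest


def newest_groups_first(comments):
    """Groups with the newest comment first; newest first inside each group."""
    newest = latest_by_id(comments)
    return sorted(
        comments,
        reverse=True,
        key=lambda c: (newest[c.object_id], c.object_id, c.submit_date),
    )


def newest_groups_first_oldest_inside(comments):
    """Same group order, but oldest first inside each group."""
    newest = latest_by_id(comments)
    # Sort on the secondary key first; the second sort is stable, so the
    # date order inside each group survives even with reverse=True.
    result = sorted(comments, key=lambda c: c.submit_date)
    result.sort(key=lambda c: (newest[c.object_id], c.object_id), reverse=True)
    return result


comments = [
    Comment(1, date(2008, 3, 1)),
    Comment(2, date(2008, 3, 2)),
    Comment(1, date(2008, 3, 3)),
    Comment(2, date(2008, 2, 28)),
]
descending = newest_groups_first(comments)
ascending_inside = newest_groups_first_oldest_inside(comments)
```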

With regard,
Michael

   +---
   | Michael Goldwasser
   | Associate Professor
   | Dept. Mathematics and Computer Science
   | Saint Louis University
   | 220 North Grand Blvd.
   | St. Louis, MO 63103-2007




Re: [Tutor] sorting objects on two attributes

2008-03-03 Thread Kent Johnson
Chris Fuller wrote:
> You could have a hierarchical sort 
> function:
> 
> def hiersort(a,b):
>     if a.attr1 != b.attr1:
>         return cmp(a.attr1, b.attr1)
>     else:
>         if a.attr2 != b.attr2:
>             return cmp(a.attr2, b.attr2)
>         else:
>             return cmp(a.attr3, b.attr3)
> 
> 
> l.sort(hiersort)

That is exactly what l.sort(key=lambda x: (x.attr1, x.attr2, x.attr3)) 
does, except the key= version is simpler and most likely faster.

You can also use
   l.sort(key=operator.attrgetter('attr1', 'attr2', 'attr3'))

> You can keep nesting for more than three attributes, or you could make it 
> arbitrary by setting it up recursively and setting the attribute hierarchy as 
> a parameter somewhere.  But that's probably unnecessarily fancy.

   l.sort(key=operator.attrgetter(*list_of_attribute_names))
should work...
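
To see that these spellings agree, here is a small sketch for Python 3, where cmp() and the positional comparison argument to sort() no longer exist. The Item namedtuple and its values are invented for the demonstration, and hiersort is rewritten as a loop and wrapped with functools.cmp_to_key; treat it as an illustration rather than a drop-in replacement.

```python
from collections import namedtuple
from functools import cmp_to_key
from operator import attrgetter

# Invented sample type with the three attributes used in the thread.
Item = namedtuple("Item", "attr1 attr2 attr3")


def hiersort(a, b):
    """Hierarchical comparison: the first differing attribute decides."""
    for name in ("attr1", "attr2", "attr3"):
        x, y = getattr(a, name), getattr(b, name)
        if x != y:
            return -1 if x < y else 1
    return 0


items = [
    Item(2, "b", 1.0),
    Item(1, "z", 3.0),
    Item(2, "a", 9.0),
    Item(1, "z", 2.0),
]

by_cmp = sorted(items, key=cmp_to_key(hiersort))
by_lambda = sorted(items, key=lambda x: (x.attr1, x.attr2, x.attr3))
by_getter = sorted(items, key=attrgetter("attr1", "attr2", "attr3"))

# The attribute hierarchy as a parameter, as suggested above.
list_of_attribute_names = ["attr1", "attr2", "attr3"]
by_names = sorted(items, key=attrgetter(*list_of_attribute_names))
```

All four results come out in the same order, because tuples compare lexicographically, which is the same rule the nested comparison spells out by hand.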

Kent



Re: [Tutor] sorting objects on two attributes

2008-03-03 Thread Chris Fuller
Almost.  And better than my original idea.  You could have a hierarchical sort 
function:

def hiersort(a,b):
    if a.attr1 != b.attr1:
        return cmp(a.attr1, b.attr1)
    else:
        if a.attr2 != b.attr2:
            return cmp(a.attr2, b.attr2)
        else:
            return cmp(a.attr3, b.attr3)


l.sort(hiersort)

You can keep nesting for more than three attributes, or you could make it 
arbitrary by setting it up recursively and setting the attribute hierarchy as 
a parameter somewhere.  But that's probably unnecessarily fancy.

Cheers



Re: [Tutor] sorting objects on two attributes

2008-03-03 Thread Kent Johnson
Andreas Kostyrka wrote:

> l.sort(key=lambda x: (x.content_type, x.submit_date))
> 
> Now, you can construct a sorted list "t":
> 
> t = []
> for key, item_iterator in itertools.groupby(
>         l, key=lambda x: (x.content_type, x.submit_date)):
>     sorted_part = sorted(item_iterator, key=lambda x: x.submit_date)
>     t.append((sorted_part[-1].submit_date, key, sorted_part))

I think you mean

for key, item_iterator in itertools.groupby(l, key=lambda x: x.content_type):
    # Group by content type only
    sorted_part = list(item_iterator)  # No need to sort again
    t.append((sorted_part[-1].submit_date, key, sorted_part))
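
Put together, the whole groupby approach might look like the sketch below (Python 3). The Comment namedtuple and the sample data are assumptions made for the demonstration. The tuples appended to the decorated list are the "decorate" step: the newest date of each group goes first so that sorting the tuples orders the groups. The sketch sorts with reverse=True, on the assumption that the groups with the newest comments should come first.

```python
from collections import namedtuple
from datetime import date
from itertools import groupby
from operator import attrgetter

# Stand-in for the comment objects discussed in the thread.
Comment = namedtuple("Comment", "content_type submit_date")


def group_newest_first(comments):
    """Return a list of groups; the group with the newest comment comes first."""
    # groupby only merges adjacent items, so sort by the grouping key first.
    ordered = sorted(comments, key=attrgetter("content_type", "submit_date"))

    decorated = []
    for key, items in groupby(ordered, key=attrgetter("content_type")):
        part = list(items)  # already in date order from the sort above
        # Decorate: the last item carries the newest date of the group.
        decorated.append((part[-1].submit_date, key, part))

    # Sort on the decoration, then undecorate by keeping only the groups.
    decorated.sort(reverse=True)
    return [part for _, _, part in decorated]


comments = [
    Comment("a", date(2008, 3, 1)),
    Comment("b", date(2008, 3, 2)),
    Comment("a", date(2008, 3, 3)),
    Comment("b", date(2008, 2, 28)),
]
groups = group_newest_first(comments)
```

Because each key appears only once in the decorated list, the tuple comparison never has to fall through to comparing the lists of comments themselves.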

Kent


Re: [Tutor] sorting objects on two attributes

2008-03-03 Thread Kent Johnson
Eric Abrahamsen wrote:

> I have a list of objects, each of which has two attributes, object_id  
> and submit_date. What I want is to sort them by content_type, then by  
> submit_date within content_type, and then sort each content_type block  
> according to which block has the newest object by submit_date. (This  
> sequence of sorting might not be optimal, I'm not sure). I'm actually  
> creating a list of recent comments on blog entries for a python-based  
> web framework, and want to arrange the comments according to blog  
> entry (content_type), by submit_date within that entry, with the  
> entries with the newest comments showing up on top.

This description doesn't match your code. There is no content_type in 
the code.

I think what you want to do is group the comments by object_id, sort 
within each object_id group by submit_date, then sort the groups by most 
recent submit date.

> I don't believe a single cmp function fed to list.sort() can do this,  
> because you can't know how two objects should be compared until you  
> know all the values for all the objects. I'd be happy to be proven  
> wrong here.

Django's SortedDict might help. Perhaps this:

from operator import attrgetter
from django.utils.datastructures import SortedDict

sd = SortedDict()
for com in sorted(queryset, key=attrgetter('submit_date'), reverse=True):
    sd.setdefault(com.object_id, []).append(com)


Now sd.keys() is a list of object_ids in descending order by most recent 
comment, and sd[object_id] is a list of comments for object_id, also in 
descending order by submit_date. If you want the comments in increasing 
date order (which I think your code below does) then you have to reverse 
the lists of comments, e.g.
   for l in sd.values():
       l.reverse()

or just reverse at the point of use with the reversed() iterator.
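
The same idea can be tried without Django. This is a sketch that assumes Python 3.7 or later, where a plain dict keeps insertion order and so can stand in for SortedDict (which, as far as I know, was removed from later Django releases). The Comment namedtuple and the data are invented for the example.

```python
from collections import namedtuple
from datetime import date
from operator import attrgetter

# Stand-in for the Django comment objects.
Comment = namedtuple("Comment", "object_id submit_date")


def group_by_recent(comments):
    """Map object_id -> comments, with keys ordered by most recent comment."""
    groups = {}  # insertion-ordered in Python 3.7+
    for com in sorted(comments, key=attrgetter("submit_date"), reverse=True):
        # The first time an id is seen is at its newest comment, so the
        # key order follows the most recent comment per id.
        groups.setdefault(com.object_id, []).append(com)
    return groups


comments = [
    Comment(1, date(2008, 3, 1)),
    Comment(2, date(2008, 3, 2)),
    Comment(1, date(2008, 3, 3)),
    Comment(2, date(2008, 2, 28)),
]
groups = group_by_recent(comments)
key_order = list(groups)

# Oldest-first view of one group, reversing at the point of use.
oldest_first = list(reversed(groups[1]))
```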

> def make_com_list(queryset):
>     ids = set([com.object_id for com in queryset])
>     xlist = [[com for com in queryset if com.object_id == i] for i in ids]
>     for ls in xlist:
>         ls.sort(key=lambda x: x.submit_date)
>     xlist.sort(key=lambda x: max([com.submit_date for com in x]),
>                reverse=True)

No need for max() since the list is sorted; use
   key=lambda x: x[-1].submit_date

Kent


Re: [Tutor] sorting objects on two attributes

2008-03-03 Thread Andreas Kostyrka
Well, this assumes that all named attributes do exist. If not, you need
to replace x.attr with getattr(x, "attr", defaultvalue) ;)
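
A minimal sketch of that getattr() fallback, for Python 3. The objects, the attribute names "rank" and "name", and DEFAULT_RANK are all made up for the example; the only point is that a missing attribute falls back to the default instead of raising AttributeError.

```python
from types import SimpleNamespace

# Made-up objects; the second one has no "rank" attribute at all.
items = [
    SimpleNamespace(name="b", rank=2),
    SimpleNamespace(name="a"),
    SimpleNamespace(name="c", rank=1),
]

DEFAULT_RANK = 0  # assumed default for objects without a rank

ordered = sorted(
    items,
    key=lambda x: (getattr(x, "rank", DEFAULT_RANK), x.name),
)
names = [x.name for x in ordered]
```

The default should be comparable with the real values, otherwise the sort fails on a type error instead.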


l.sort(key=lambda x: (x.content_type, x.submit_date))

Now, you can construct a sorted list "t":

t = []
for key, item_iterator in itertools.groupby(
        l, key=lambda x: (x.content_type, x.submit_date)):
    sorted_part = sorted(item_iterator, key=lambda x: x.submit_date)
    t.append((sorted_part[-1].submit_date, key, sorted_part))

t.sort()

t = sum([x[2] for x in t], [])

Totally untested, as written in the MTA :)

Andreas






[Tutor] sorting objects on two attributes

2008-03-03 Thread Eric Abrahamsen
I have a grisly little sorting problem to which I've hacked together a  
solution, but I'm hoping someone here might have a better suggestion.

I have a list of objects, each of which has two attributes, object_id  
and submit_date. What I want is to sort them by content_type, then by  
submit_date within content_type, and then sort each content_type block  
according to which block has the newest object by submit_date. (This  
sequence of sorting might not be optimal, I'm not sure). I'm actually  
creating a list of recent comments on blog entries for a python-based  
web framework, and want to arrange the comments according to blog  
entry (content_type), by submit_date within that entry, with the  
entries with the newest comments showing up on top.

I don't believe a single cmp function fed to list.sort() can do this,  
because you can't know how two objects should be compared until you  
know all the values for all the objects. I'd be happy to be proven  
wrong here.

After some false starts with dictionaries, here's what I've got.  
Queryset is the original list of comments (I'm doing this in django),  
and it returns a list of lists, though I might flatten it afterwards.  
It works, but it's ghastly unreadable and if there were a more  
graceful solution I'd feel better about life in general:


def make_com_list(queryset):
    ids = set([com.object_id for com in queryset])
    xlist = [[com for com in queryset if com.object_id == i] for i in ids]
    for ls in xlist:
        ls.sort(key=lambda x: x.submit_date)
    xlist.sort(key=lambda x: max([com.submit_date for com in x]),
               reverse=True)
    return xlist

I'd appreciate any hints!

Thanks,
Eric


Re: [Tutor] Need help with encoder & decryption keys

2008-03-03 Thread Andreas Kostyrka
Well, actually, ssh can also protect private keys with a cryptographic 
pass phrase. But this is often not what is wanted, as it implies that the 
user needs to provide the pass phrase every time the key is used. (Well, 
that's not the complete truth, see man ssh-agent, but that's a completely 
different thing ;) )

Andreas

Kent Johnson wrote:
> Trey Keown wrote:
> 
>> mmm... So, what would be an effective way to hide the data's key?
>> I'm kind of new to the whole encryption scene, although I've had some
>> experience with it whilst working on homebrew software on gaming
>> platforms.
> 
> I don't know, I'm not a crypto expert. I guess it depends partly on what 
> you are trying to do.
> 
> I do have a little experience with SSH, it stores keys in files in the 
> filesystem and relies on the OS to protect them from unauthorized access.
> 
> Kent
> 
> PS Please use Reply All to reply to the list.


Re: [Tutor] Need help with encoder & decryption keys

2008-03-03 Thread Andreas Kostyrka
And if it's a string constant, in many cases running strings (Unix 
program) on the pyc file will reveal it too.
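
A quick way to convince yourself, sketched for Python 3 without writing any file: compile a module source and look at what ends up in the code object. The source line and the "secret" are made up for the demonstration. marshal is the serialization that .pyc files use for the code object, so bytes that are visible there are the kind of thing strings would print.

```python
import marshal

# Hypothetical module source with an embedded secret (not a real key).
source = 'SECRET_KEY = "s3cr3t-demo-key"\n'

# This is roughly what happens when the module is byte-compiled.
code = compile(source, "secretmodule.py", "exec")

# The literal sits unchanged in the code object's constants table ...
found_in_consts = "s3cr3t-demo-key" in code.co_consts

# ... and shows up as plain bytes in the marshalled form.
found_in_bytes = b"s3cr3t-demo-key" in marshal.dumps(code)
```

Both checks come out true, so byte-compiling does nothing to hide a string constant.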

All this basically boils down to the problem that it's hard to embed an 
encryption key in a program so that it's not possible to extract it.

Notice the current crop of HD DVD/Blu-ray decrypters, which tend to 
derive their copy of the key by extracting it from legal software players.

(Hint: it's basically an unsolvable problem; notice how the industry 
"solved" it by making that kind of analysis illegal. DMCA is the hint here.)

If you want to have "typical" C program security for embedded key data, 
the only thing that you can do is to make a Pyrex module (Which gets 
compiled to C, and is loaded as an .so shared library).

As pointed out above, this is not really safe, it just makes it slightly 
harder to extract the data.

Some thoughts:

1.) the data cannot be made accessible to Python, or else you can read 
it out. That means decryption/encryption needs to be done in Pyrex/C. 
PLUS Pyrex should not use any Python functions/modules to achieve the 
goal. (import secretmodule ; print secretmodule.value. OR: import md5 ; 
md5.md5 = myLoggingMd5Replacement )

2.) the stuff done in your C module cannot be too trivial, or a determined 
attacker can easily remove references to the module.

So to summarize: Your idea usually makes no sense. It clearly falls into 
the "know what you are doing and why" ** n category, with n > 1 ;)
There might be technical or legal reasons to do that, BUT they are 
really, really rare.

And this category question is usually not really appropriate for a 
tutoring mailing list :)

The only way to make sure something is not compromised is to avoid 
giving it out completely, and put it on your own servers. Again, the 
above thoughts apply. Plus consider that your program is running in an 
environment that you do not control. E.g. even if you communicate via 
SSL and check certificates, and embed the CA certificate into your app 
so that it cannot be easily replaced, consider loading a small wrapper 
for SSL_read/SSL_write via LD_PRELOAD. Oops, the user just learned the 
complete clear text of your communication. Worse, if it's too simple he 
can just replace the data, in the worst case by loading a wrapper around 
openssl that mimics the server.

It's no fun, and the easiest way is to give your users the access (e.g. 
the source code). This stops the crowd that has to prove itself for the 
fun of breaking a security system. (And yes, there are people that do 
that). Then put your rules into the LICENSE. That lays out the rules 
what is allowed. And in some cases, add some lines that check 
"licenses", so a customer cannot claim that he ran your program 
unauthorized by mistake. Not much more you can do here, sorry.

Making it hard to remove the license check means just that the fun crowd 
might get motivated into breaking it. Leaving out at least some checks 
means that users can claim that they ran a copy of your program by 
mistake. That might have legal ramifications, and worse, in many cases 
(if you are not a member of BSA), you wouldn't want to sue a customer 
anyway, right? (Nothing surer to piss off a customer than to sue him.)

Andreas

Kent Johnson wrote:
> Trey Keown wrote:
>> is it
>> possible to decompile things within a .pyc file?
> 
> Yes, it is possible. There is a commercial service that will do this, 
> for older versions of Python at least.
> 
> To figure out a secret key kept in a .pyc file it might be enough to 
> disassemble functions in the module; that can be done with the standard 
> dis module.
> 
> And of course if you stored the secrets as module globals all you have 
> to do is import the module and print the value...
> 
> Kent