Re: Newbie question about Python syntax

2019-08-26 Thread Paul St George

On 25/08/2019 02:39, Cameron Simpson wrote:

On 24Aug2019 21:52, Paul St George  wrote:

[snip]
Aside from "map" being a poor name (it is also a builtin Python 
function), it seems that one creates one of these to control how some 
rendering process is done.


The class reference page you originally cited then specifies the meaning 
of the various attributes you might set on one of these objects.


Cheers,
Cameron Simpson 


Thanks Cameron. As this list has a low noise to signal ratio I cannot 
thank you enough here.


I could have stayed where I belong in Blender Artists, or similar, but 
those lists tend to just offer solutions and, as Douglas Adams almost 
said, knowledge without understanding is almost meaningless. Here I have 
gained enough understanding (perhaps not yet enough to make sufficient 
sense in what I say) to transfer knowledge from solving one problem to 
possibly solving many.


Thank you for your patience and tolerance,

Dr Paul St George
--
http://www.paulstgeorge.com
http://www.devices-of-wonder.com


--
https://mail.python.org/mailman/listinfo/python-list


Re: Newbie question about Python syntax

2019-08-24 Thread Cameron Simpson

On 24Aug2019 21:52, Paul St George  wrote:

Have you not got one of these handed to you from something?

Or are you right at the outside with some "opaque" blender handle or 
something? (Disclaimer: I've never used Blender.)


Thank you once again.
If I understand your question, I am right outside. By this I mean I 
have not created anything with Python. I have made the Blender model 
with the UI and am trying to use Python to read the values for the 
settings used. This has worked for all settings except this Map Value 
Node.


Hmm. So you have a CompositorNodeMapValue instance? If that is the case 
you should be able to inspect it as previously described.


However, it looks like this is something you construct in order to do 
some task. A little web searching turns up this stackexchange post:


 
https://blender.stackexchange.com/questions/42579/render-depth-map-to-image-with-python-script/42667

and some example code from an unrelated project:

 
https://github.com/panmari/stanford-shapenet-renderer/blob/master/render_blender.py


From the stackexchange post:


   map = tree.nodes.new(type="CompositorNodeMapValue")
   # Size is chosen kind of arbitrarily, try out until you're satisfied 
   # with resulting depth map.

   map.size = [0.08]
   map.use_min = True
   map.min = [0]
   map.use_max = True
   map.max = [255]

"tree" is "bpy.context.scene.node_tree".

Aside from "map" being a poor name (it is also a builtin Python 
function), it seems that one creates one of these to control how some 
rendering process is done.


The class reference page you originally cited then specifies the meaning 
of the various attributes you might set on one of these objects.
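For reading values from a node that already exists (rather than creating 
one with tree.nodes.new), a minimal sketch follows. It assumes that 
Blender exposes the compositor nodes as scene.node_tree.nodes, as the 
stackexchange snippet above suggests, and that each node carries a 
bl_idname string naming its class. The helper find_nodes and the 
stand-in fake_tree are hypothetical and exist only so the sketch runs 
outside Blender.

```python
from types import SimpleNamespace


def find_nodes(tree, idname="CompositorNodeMapValue"):
    """Return the nodes in `tree` whose bl_idname matches `idname`."""
    return [node for node in tree.nodes
            if getattr(node, "bl_idname", None) == idname]


# Stand-in objects so the sketch runs outside Blender. Inside Blender,
# pass bpy.context.scene.node_tree instead of fake_tree (assumption:
# the scene has compositor nodes enabled, so node_tree is not None).
fake_tree = SimpleNamespace(nodes=[
    SimpleNamespace(bl_idname="CompositorNodeRLayers"),
    SimpleNamespace(bl_idname="CompositorNodeMapValue",
                    min=[0.0], max=[255.0]),
])

for node in find_nodes(fake_tree):
    # Each match is an instance, so .min and .max are available on it.
    print(node.bl_idname, node.min[0], node.max[0])
```

The point is that the instance comes from the scene's node tree, not 
from bpy.types; the class is only used to recognise the right node.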


Cheers,
Cameron Simpson 


Re: Newbie question about Python syntax

2019-08-24 Thread Barry
Have you tried asking on a blender user mailing list for help with this problem?

It seems that someone familiar with blender and its python interface should be 
able to help get you going.

Barry

> On 24 Aug 2019, at 20:52, Paul St George  wrote:
> 
>> On 24/08/2019 01:23, Cameron Simpson wrote:
>>> On 23Aug2019 13:49, Paul St George  wrote:
>>> Context:
>>> I am using Python to interrogate the value of some thing in Blender (just 
>>> as someone else might want to use Python to look at an email in a Mail 
>>> program or an image in Photoshop).
>>> 
>>> Assumptions:
>>> So, I want to look at the attribute of an instance of a class called 
>>> CompositorNodeMapValue. The code in the Python tutorial seems to be for 
>>> creating an instance of a class, but I assume Blender (in my case) has 
>>> already created the instance that I want to interrogate.
>> That would be the expectation. And to interrogate it, you need that instance 
>> to hand in a variable.
>>> Question:
>>> If this is so, should I find the code to get a list of the instances that 
>>> have been made (maybe using inspect?) and then, when I have its name, the 
>>> attributes of the one that interests me?
>> Have you not got one of these handed to you from something?
>> Or are you right at the outside with some "opaque" blender handle or 
>> something? (Disclaimer: I've never used Blender.)
> 
> Thank you once again.
> If I understand your question, I am right outside. By this I mean I have not 
> created anything with Python. I have made the Blender model with the UI and 
> am trying to use Python to read the values for the settings used. This has 
> worked for all settings except this Map Value Node.
> 
>> You can inspect objects with the inspect module. You can also be more 
>> direct. Given an object "o", you can do an assortment of things:
> 
> Before I do any of the following, I assume I need to use something like:
> 
> import struct
> class CompositorNodeMapValue(o):
> 
> I have tried this. Nothing happens. Not even an error. It's like waiting for 
> Godot.
> 
> I am guessing I am in the wrong namespace.
> 
> I don't know whether it is relevant, but I tried plain
> dir()
> and
> dir(struct)
> 
> They each returned a list and neither list had mention of 
> CompositorNodeMapValue
> 
> If I do something like:
> o = CompositorNodeMapValue()
> I get:
> NameError: name 'CompositorNodeMapValue' is not defined
> 
>> dir(o) gets a list of its attribute names.
>> help(o) prints out the docstring, somewhat rendered.
>> o.__dict__ is usually a dict mapping attribute names to their values.
>> type(o) gets you its type, so "print(type(o))" or "print(type(o).__name__)" 
>> can be handy.
>> A crude probe function (untested):
>>  def probe(o):
>>      print(o)
>>      for attr, value in sorted(o.__dict__.items()):
>>          print(" ", attr, type(value).__name__, value)
>> Enjoy,
>> Cameron Simpson  (formerly c...@zip.com.au)
>> "Are we alpinists, or are we tourists" followed by "tourists! tourists!"
>>- Kobus Barnard  in rec.climbing,
>>  on things he's heard firsthand


Re: Newbie question about Python syntax

2019-08-24 Thread Paul St George

On 24/08/2019 01:23, Cameron Simpson wrote:

On 23Aug2019 13:49, Paul St George  wrote:

Context:
I am using Python to interrogate the value of some thing in Blender 
(just as someone else might want to use Python to look at an email in 
a Mail program or an image in Photoshop).


Assumptions:
So, I want to look at the attribute of an instance of a class called 
CompositorNodeMapValue. The code in the Python tutorial seems to be 
for creating an instance of a class, but I assume Blender (in my case) 
has already created the instance that I want to interrogate.


That would be the expectation. And to interrogate it, you need that 
instance to hand in a variable.



Question:
If this is so, should I find the code to get a list of the instances 
that have been made (maybe using inspect?) and then, when I have its 
name, the attributes of the one that interests me?


Have you not got one of these handed to you from something?

Or are you right at the outside with some "opaque" blender handle or 
something? (Disclaimer: I've never used Blender.)


Thank you once again.
If I understand your question, I am right outside. By this I mean I have 
not created anything with Python. I have made the Blender model with the 
UI and am trying to use Python to read the values for the settings used. 
This has worked for all settings except this Map Value Node.




You can inspect objects with the inspect module. You can also be more 
direct. Given an object "o", you can do an assortment of things:


Before I do any of the following, I assume I need to use something like:

import struct
class CompositorNodeMapValue(o):

I have tried this. Nothing happens. Not even an error. It's like waiting 
for Godot.


I am guessing I am in the wrong namespace.

I don't know whether it is relevant, but I tried plain
dir()
and
dir(struct)

They each returned a list and neither list had mention of 
CompositorNodeMapValue


If I do something like:
o = CompositorNodeMapValue()
I get:
NameError: name 'CompositorNodeMapValue' is not defined



dir(o) gets a list of its attribute names.

help(o) prints out the docstring, somewhat rendered.

o.__dict__ is usually a dict mapping attribute names to their values.

type(o) gets you its type, so "print(type(o))" or 
"print(type(o).__name__)" can be handy.


A crude probe function (untested):

  def probe(o):
    print(o)
    for attr, value in sorted(o.__dict__.items()):
      print(" ", attr, type(value).__name__, value)

Enjoy,
Cameron Simpson  (formerly c...@zip.com.au)

"Are we alpinists, or are we tourists" followed by "tourists! tourists!"
    - Kobus Barnard  in rec.climbing,
  on things he's heard firsthand





Re: Newbie question about Python syntax

2019-08-23 Thread Cameron Simpson

On 23Aug2019 13:49, Paul St George  wrote:

Context:
I am using Python to interrogate the value of some thing in Blender 
(just as someone else might want to use Python to look at an email in 
a Mail program or an image in Photoshop).


Assumptions:
So, I want to look at the attribute of an instance of a class called 
CompositorNodeMapValue. The code in the Python tutorial seems to be 
for creating an instance of a class, but I assume Blender (in my case) 
has already created the instance that I want to interrogate.


That would be the expectation. And to interrogate it, you need that 
instance to hand in a variable.



Question:
If this is so, should I find the code to get a list of the instances 
that have been made (maybe using inspect?) and then, when I have its 
name, the attributes of the one that interests me?


Have you not got one of these handed to you from something?

Or are you right at the outside with some "opaque" blender handle or 
something? (Disclaimer: I've never used Blender.)


You can inspect objects with the inspect module. You can also be more 
direct. Given an object "o", you can do an assortment of things:


dir(o) gets a list of its attribute names.

help(o) prints out the docstring, somewhat rendered.

o.__dict__ is usually a dict mapping attribute names to their values.

type(o) gets you its type, so "print(type(o))" or 
"print(type(o).__name__)" can be handy.


A crude probe function (untested):

 def probe(o):
   print(o)
   for attr, value in sorted(o.__dict__.items()):
     print(" ", attr, type(value).__name__, value)
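Some objects, particularly ones implemented in C or using __slots__, 
have no __dict__, in which case the probe above raises AttributeError. 
A variant built on dir() and getattr() is sketched below. It is an 
illustration rather than a tested tool for Blender objects; the names 
probe_attrs and Demo are made up for the example.

```python
def probe_attrs(o):
    """Return (name, type name, value) for each public attribute of o."""
    rows = []
    for name in sorted(dir(o)):
        if name.startswith("_"):
            continue  # skip private and dunder names
        try:
            value = getattr(o, name)
        except Exception:
            continue  # some attributes may refuse to be read
        rows.append((name, type(value).__name__, value))
    return rows


class Demo:
    """Small stand-in class without a __dict__ on its instances."""
    __slots__ = ("size", "use_min")

    def __init__(self):
        self.size = [0.08]
        self.use_min = True


for name, type_name, value in probe_attrs(Demo()):
    print(" ", name, type_name, value)
```

Unlike the __dict__ version, this also lists methods and properties, so 
the output can be longer.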

Enjoy,
Cameron Simpson  (formerly c...@zip.com.au)

"Are we alpinists, or are we tourists" followed by "tourists! tourists!"
   - Kobus Barnard  in rec.climbing,
 on things he's heard firsthand


Re: Newbie question about Python syntax

2019-08-23 Thread Paul St George

On 22/08/2019 23:21, Kyle Stanley wrote:
[snip]


The tutorial that Terry was referring to was the one on docs.python.org,
here's a couple of links for the sections he was referring to:

Full section on classes: https://docs.python.org/3/tutorial/classes.html

Section on instantiating objects from classes:
https://docs.python.org/3/tutorial/classes.html#class-objects


[snip]



Aha, thank you all.
Here then, is my first problem.

Context:
I am using Python to interrogate the value of some thing in Blender 
(just as someone else might want to use Python to look at an email in a 
Mail program or an image in Photoshop).


Assumptions:
So, I want to look at the attribute of an instance of a class called 
CompositorNodeMapValue. The code in the Python tutorial seems to be for 
creating an instance of a class, but I assume Blender (in my case) has 
already created the instance that I want to interrogate.


Question:
If this is so, should I find the code to get a list of the instances 
that have been made (maybe using inspect?) and then, when I have its 
name, the attributes of the one that interests me?


Or shall I go into the garden to eat worms?



Re: Newbie question about Python syntax

2019-08-22 Thread Kyle Stanley
> You are right, but it is even worse than you think. I do not have a
> tutorial so I have no examples to understand.

The tutorial that Terry was referring to was the one on docs.python.org,
here's a couple of links for the sections he was referring to:

Full section on classes: https://docs.python.org/3/tutorial/classes.html

Section on instantiating objects from classes:
https://docs.python.org/3/tutorial/classes.html#class-objects

On Thu, Aug 22, 2019 at 4:40 PM Paul St George 
wrote:

> On 22/08/2019 20:02, Terry Reedy wrote:
> > On 8/22/2019 3:34 AM, Paul St George wrote:
> >> I have the Python API for the Map Value Node here:
> >> <https://docs.blender.org/api/current/bpy.types.CompositorNodeMapValue.html>.
> >>
> >>
> >> All well and good. Now I just want to write a simple line of code such
> >> as:
> >>
> >> import bpy
> >>
> >> ...
> >>
> >>  >>>print(bpy.types.CompositorNodeMapValue.max[0])
> >>
> >> If this works, I will do something similar for max, min, offset and
> >> then size.
> >
> >  From this and your other responses, you seem to not understand some of
> > the concepts explained in the tutorial, in particular class and class
> > instance.  Perhaps you should reread the appropriate section(s), and if
> > you don't understand any of the examples, ask about them here.  We are
> > familiar with those, but not with CompositorNodeMapValue.
> >
> >
> Terry,
> You are right, but it is even worse than you think. I do not have a
> tutorial so I have no examples to understand.
>
> Reading Cameron et al, I have broken the problem down into:
> do something (probably using the word self) that _gives_ me an instance
> of CompositorNodeMapValue.
>
> Then when I have done that,
> look at some of the attributes (.max, .min, .offset, .size) of the
> instance.
>
> Paul
>


Re: Newbie question about Python syntax

2019-08-22 Thread Paul St George

On 22/08/2019 20:02, Terry Reedy wrote:

On 8/22/2019 3:34 AM, Paul St George wrote:
I have the Python API for the Map Value Node here: 
<https://docs.blender.org/api/current/bpy.types.CompositorNodeMapValue.html>.



All well and good. Now I just want to write a simple line of code such 
as:


import bpy

...

 >>>print(bpy.types.CompositorNodeMapValue.max[0])

If this works, I will do something similar for max, min, offset and 
then size.


 From this and your other responses, you seem to not understand some of 
the concepts explained in the tutorial, in particular class and class 
instance.  Perhaps you should reread the appropriate section(s), and if 
you don't understand any of the examples, ask about them here.  We are 
familiar with those, but not with CompositorNodeMapValue.




Terry,
You are right, but it is even worse than you think. I do not have a 
tutorial so I have no examples to understand.


Reading Cameron et al, I have broken the problem down into:
do something (probably using the word self) that _gives_ me an instance 
of CompositorNodeMapValue.


Then when I have done that,
look at some of the attributes (.max, .min, .offset, .size) of the instance.

Paul



Re: Newbie question about Python syntax

2019-08-22 Thread Terry Reedy

On 8/22/2019 3:34 AM, Paul St George wrote:
I have the Python API for the Map Value Node here: 
<https://docs.blender.org/api/current/bpy.types.CompositorNodeMapValue.html>.



All well and good. Now I just want to write a simple line of code such as:

import bpy

...

 >>>print(bpy.types.CompositorNodeMapValue.max[0])

If this works, I will do something similar for max, min, offset and then 
size.


From this and your other responses, you seem to not understand some of 
the concepts explained in the tutorial, in particular class and class 
instance.  Perhaps you should reread the appropriate section(s), and if 
you don't understand any of the examples, ask about them here.  We are 
familiar with those, but not with CompositorNodeMapValue.



--
Terry Jan Reedy



Re: Newbie question about Python syntax

2019-08-22 Thread Chris Angelico
On Thu, Aug 22, 2019 at 9:20 PM Paul St George  wrote:
>
> On 22/08/2019 11:49, Cameron Simpson wrote:
> > On 22Aug2019 09:34, Paul St George  wrote:
> >> I have the Python API for the Map Value Node here:
> >> <https://docs.blender.org/api/current/bpy.types.CompositorNodeMapValue.html>.
> >>
> >>
> >> All well and good. Now I just want to write a simple line of code such
> >> as:
> >>
> >> import bpy
> > print(bpy.types.CompositorNodeMapValue.max[0])
> > [...]
> >> AttributeError: type object 'CompositorNodeMapValue' has no attribute
> >> 'max'
> >
> > CompositorNodeMapValue is a class. All the attributes described are for
> > instances of the class. So you need to do something that _gives_ you an
> > instance of CompositorNodeMapValue. That instance should have a .max array.
> >
> > Cheers,
> > Cameron Simpson 
>
> Gulp. Thank you. I did ask for a pointer but perhaps I need a roadmap.
>
> I have tried to do something that gives me an instance of
> CompositorNodeMapValue. I don't think I should humiliate myself further
> by sharing my attempts so could you please show me what the code should
> look like.
>

Don't think of it as humiliating yourself - you're asking for fairly
specific advice, so showing your code is the best way to get that sort
of advice. We don't bite :)

ChrisA


Re: Newbie question about Python syntax

2019-08-22 Thread Paul St George

On 22/08/2019 11:49, Cameron Simpson wrote:

On 22Aug2019 09:34, Paul St George  wrote:
I have the Python API for the Map Value Node here: 
<https://docs.blender.org/api/current/bpy.types.CompositorNodeMapValue.html>.



All well and good. Now I just want to write a simple line of code such 
as:


import bpy

print(bpy.types.CompositorNodeMapValue.max[0])

[...]
AttributeError: type object 'CompositorNodeMapValue' has no attribute 
'max'


CompositorNodeMapValue is a class. All the attributes described are for 
instances of the class. So you need to do something that _gives_ you an 
instance of CompositorNodeMapValue. That instance should have a .max array.


Cheers,
Cameron Simpson 


Gulp. Thank you. I did ask for a pointer but perhaps I need a roadmap.

I have tried to do something that gives me an instance of 
CompositorNodeMapValue. I don't think I should humiliate myself further 
by sharing my attempts so could you please show me what the code should 
look like.




Re: Newbie question about Python syntax

2019-08-22 Thread Cameron Simpson

On 22Aug2019 09:34, Paul St George  wrote:

I have the Python API for the Map Value Node here: 
<https://docs.blender.org/api/current/bpy.types.CompositorNodeMapValue.html>.

All well and good. Now I just want to write a simple line of code such as:

import bpy

print(bpy.types.CompositorNodeMapValue.max[0])

[...]
AttributeError: type object 'CompositorNodeMapValue' has no attribute 
'max'


CompositorNodeMapValue is a class. All the attributes described are for 
instances of the class. So you need to do something that _gives_ you an 
instance of CompositorNodeMapValue. That instance should have a .max 
array.
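The difference between asking the class and asking an instance can be 
seen in plain Python, without Blender. In this sketch MapValueNode is a 
hypothetical stand-in, not the real bpy type; it only assumes that the 
attributes are created per instance.

```python
class MapValueNode:
    """Hypothetical stand-in for a node class; not the real bpy type."""

    def __init__(self):
        # Attributes created here belong to each instance, not the class.
        self.min = [0.0]
        self.max = [1.0]


try:
    MapValueNode.max[0]          # asking the class itself fails
except AttributeError as exc:
    class_error = str(exc)
    print(class_error)

node = MapValueNode()            # asking an instance works
instance_max = node.max[0]
print(instance_max)
```

The first print shows the same kind of "type object ... has no 
attribute 'max'" message as the error quoted above.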


Cheers,
Cameron Simpson 


Newbie question about Python syntax

2019-08-22 Thread Paul St George
I have the Python API for the Map Value Node here: 
<https://docs.blender.org/api/current/bpy.types.CompositorNodeMapValue.html>.


All well and good. Now I just want to write a simple line of code such as:

import bpy

...

>>>print(bpy.types.CompositorNodeMapValue.max[0])

If this works, I will do something similar for max, min, offset and then 
size.


I know my embarrassingly feeble attempt is wrong because the console 
tells me:

AttributeError: type object 'CompositorNodeMapValue' has no attribute 'max'

Could anyone (please) point me in the right direction?

Thanks,
Paul


Re: newbie question

2019-08-01 Thread Sidney Langweil
On Thursday, August 1, 2019 at 7:57:31 AM UTC-7, Calvin Spealman wrote:
> Sorry, but you can't. If you have two python modules, neither has access to
> things in the other without an import.
> 
> That's the whole point of an import.
> 
> On Thu, Aug 1, 2019 at 10:30 AM Sidney Langweil 
> wrote:
> 
> > A Python script invokes a function in another file in the same directory.
> >
> > I would like to invoke that function without the need for an import.
> >
> > I think I read that having an empty __init__.py is sufficient.  But it
> > does not seem to work for me.
> >
> > I'm sure this is obvious to many of you.  Thanks in advance for your help.
> >
> > Sid

Thank you.  As long as using 'import' is correct in this case, I do not mind 
inserting the extra line(s).

Sid


Re: newbie question

2019-08-01 Thread Calvin Spealman
Sorry, but you can't. If you have two python modules, neither has access to
things in the other without an import.

That's the whole point of an import.
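A small sketch of what the import does in practice. So that it runs 
anywhere, it writes a throwaway module into a temporary folder; the 
module name sid_helpers and the function greet are invented for the 
illustration. In the situation described, the second file would already 
sit next to the script and only the import line would be needed.

```python
import sys
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as folder:
    # Hypothetical module standing in for "another file in the same
    # directory".
    Path(folder, "sid_helpers.py").write_text(
        "def greet(name):\n    return 'Hello, ' + name\n"
    )
    # A script's own directory is normally on sys.path already; the
    # temporary folder has to be added by hand.
    sys.path.insert(0, folder)
    try:
        import sid_helpers       # the import makes greet reachable
        result = sid_helpers.greet("Sid")
    finally:
        sys.path.remove(folder)

print(result)
```

An empty __init__.py marks a directory as a package so that it can be 
imported; it does not make the names inside it available without an 
import.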

On Thu, Aug 1, 2019 at 10:30 AM Sidney Langweil 
wrote:

> A Python script invokes a function in another file in the same directory.
>
> I would like to invoke that function without the need for an import.
>
> I think I read that having an empty __init__.py is sufficient.  But it
> does not seem to work for me.
>
> I'm sure this is obvious to many of you.  Thanks in advance for your help.
>
> Sid


-- 

CALVIN SPEALMAN



newbie question

2019-08-01 Thread Sidney Langweil
A Python script invokes a function in another file in the same directory.

I would like to invoke that function without the need for an import.

I think I read that having an empty __init__.py is sufficient.  But it does not 
seem to work for me.

I'm sure this is obvious to many of you.  Thanks in advance for your help.

Sid


tix.FileSelectBox causes crash: was A newbie question about using tix

2019-05-03 Thread David Sumbler


On Wed, 2019-05-01 at 19:11 +0100, MRAB wrote:
> On 2019-05-01 17:44, David Sumbler wrote:
>  >
>  > On Tue, 2019-04-30 at 20:46 +0100, MRAB wrote:
...
>  > > For some reason, tix widgets don't work with normal tkinter widgets,
>  > > so you can't put a tix FileSelectBox on a tkinter.Tk widget.
>  > >
>  > > There is, however, a tix.Tk widget that you can use instead:
>  > >
>  > > import tkinter.tix as tix
>  > > root = tix.Tk()
>  > > f = tix.FileSelectBox(root)
>  > > f.pack()
>  >
>  > Thanks for that.
>  >
>  > When I ran the above, I got:
>  >
>  >  Traceback (most recent call last):
>  >    File "/home/david/bin/GradientProfile_v2.py", line 2, in <module>
>  >      root = tix.Tk()
>  >    File "/usr/lib/python3.6/tkinter/tix.py", line 214, in __init__
>  >      self.tk.eval('package require Tix')
>  >  _tkinter.TclError: can't find package Tix
>  >
>  > After an internet search, I tried:
>  >
>  > sudo apt install tix-dev tk-dev tk8.6-dev libxft-dev libfontconfig1-dev libfreetype6-dev libpng-dev
>  >
>  > Now when I run the file the program just exits quickly, with no
>  > reported errors, but no window(s).  If I add 'root.mainloop()' at the
>  > end, I get an empty root window for a fraction of a second, then the
>  > program exits with:
>  >
>  >  Segmentation fault (core dumped)
>  >
>  > Any suggestions as to where to go from here?
>  >
> Tested on Raspbian in a terminal window:
> 
> sudo apt-get install tix
> 
> python3
> 
> import tkinter.tix as tix
> root = tix.Tk()
> f = tix.FileSelectBox(root)
> f.pack()
> 
> At this point there's a GUI window filled with a file selection box.

When I enter the above lines in python 3.6.7 on my desktop computer,
running Ubuntu 18.04, the final line causes Python to crash with the
message:

 Segmentation fault (core dumped)

I also tried this in python 2.7.15rc1 with the same result.

I then tried it on my HP laptop computer, also running Ubuntu 18.04. 
Again, it produced a segmentation fault.

However, on my partner's computer, which is running Ubuntu 16.04, the
code works as it should.

It looks as if perhaps there is some bug in Ubuntu 18.04 which, so far
as my experience goes, only manifests itself when using tix in Python
(either version 2 or 3).  I have run most of the files in Mark Lutz's
Programming Python (including most of the tkinter chapters) without any
problem.

I can find no reference to a bug of this sort when doing an internet
search, so perhaps there could be some other explanation.

Does anyone else see the same crash when trying to use
tix.FileSelectBox?

David


Re: A newbie question about using tix

2019-05-01 Thread MRAB

On 2019-05-01 17:44, David Sumbler wrote:
>
> On Tue, 2019-04-30 at 20:46 +0100, MRAB wrote:
> > On 2019-04-30 16:40, David Sumbler wrote:
> > > Running Ubuntu 18.04, Python 3.6.7, tkinter 8.6
> > >
> > > I am very new to tkinter.  The simple program I am writing requires
> > > a
> > > user file to be selected before it does anything else, so I would
> > > like
> > > a file selection dialog in the main window as soon as the program
> > > launches.
> > >
> > > Tkinter only has askopenfilename(), but this produces a popup
> > > dialog.
> > > I can get something like what I want by specifying a small Tk()
> > > window
> > > and then calling askopenfilename() so that it covers the root
> > > window.
> > >
> > > It's not ideal, though.  From the documentation I thought that the
> > > tix
> > > FileSelectBox would do what I wanted, but I just can't get it to
> > > work.
> > >
> > > If I run:
> > >
> > >   from tkinter import *
> > >   from tkinter.tix import FileSelectBox
> > >   root = Tk()
> > >   f = FileSelectBox(root)
> > >
> > > I get the following:
> > >
> > >   Traceback (most recent call last):
> > >     File "/home/david/bin/Gradient.py", line 4, in <module>
> > >       f = FileSelectBox(root)
> > >     File "/usr/lib/python3.6/tkinter/tix.py", line 795, in __init__
> > >       TixWidget.__init__(self, master, 'tixFileSelectBox', ['options'], cnf, kw)
> > >     File "/usr/lib/python3.6/tkinter/tix.py", line 311, in __init__
> > >       self.tk.call(widgetName, self._w, *extra)
> > >   _tkinter.TclError: invalid command name "tixFileSelectBox"
> > >
> > > I realize that assigning the value of FileSelectBox() isn't going
> > > to
> > > give me a filename: I'm just trying to get the basic syntax right
> > > at
> > > the moment.
> > >
> > > > I can't figure out what is wrong though.  Have I misunderstood how
> > > > it should be called, or is there something missing from my system?
> > >
> >
> > For some reason, tix widgets don't work with normal tkinter widgets,
> > so
> > you can't put a tix FileSelectBox on a tkinter.Tk widget.
> >
> > There is, however, a tix.Tk widget that you can use instead:
> >
> > import tkinter.tix as tix
> > root = tix.Tk()
> > f = tix.FileSelectBox(root)
> > f.pack()
>
> Thanks for that.
>
> When I ran the above, I got:
>
>  Traceback (most recent call last):
>    File "/home/david/bin/GradientProfile_v2.py", line 2, in <module>
>      root = tix.Tk()
>    File "/usr/lib/python3.6/tkinter/tix.py", line 214, in __init__
>      self.tk.eval('package require Tix')
>  _tkinter.TclError: can't find package Tix
>
> After an internet search, I tried:
>
> sudo apt install tix-dev tk-dev tk8.6-dev libxft-dev libfontconfig1-dev libfreetype6-dev libpng-dev
>
> Now when I run the file the program just exits quickly, with no
> reported errors, but no window(s).  If I add 'root.mainloop()' at the
> end, I get an empty root window for a fraction of a second, then the
> program exits with:
>
>  Segmentation fault (core dumped)
>
> Any suggestions as to where to go from here?
>
Tested on Raspbian in a terminal window:

sudo apt-get install tix

python3

import tkinter.tix as tix
root = tix.Tk()
f = tix.FileSelectBox(root)
f.pack()

At this point there's a GUI window filled with a file selection box.



Re: A newbie question about using tix

2019-05-01 Thread David Sumbler


On Tue, 2019-04-30 at 20:46 +0100, MRAB wrote:
> On 2019-04-30 16:40, David Sumbler wrote:
> > Running Ubuntu 18.04, Python 3.6.7, tkinter 8.6
> > 
> > I am very new to tkinter.  The simple program I am writing requires
> > a
> > user file to be selected before it does anything else, so I would
> > like
> > a file selection dialog in the main window as soon as the program
> > launches.
> > 
> > Tkinter only has askopenfilename(), but this produces a popup
> > dialog.
> > I can get something like what I want by specifying a small Tk()
> > window
> > and then calling askopenfilename() so that it covers the root
> > window.
> > 
> > It's not ideal, though.  From the documentation I thought that the
> > tix
> > FileSelectBox would do what I wanted, but I just can't get it to
> > work.
> > 
> > If I run:
> > 
> >   from tkinter import *
> >   from tkinter.tix import FileSelectBox
> >   root = Tk()
> >   f = FileSelectBox(root)
> > 
> > I get the following:
> > 
> >   Traceback (most recent call last):
> >     File "/home/david/bin/Gradient.py", line 4, in <module>
> >       f = FileSelectBox(root)
> >     File "/usr/lib/python3.6/tkinter/tix.py", line 795, in __init__
> >       TixWidget.__init__(self, master, 'tixFileSelectBox', ['options'], cnf, kw)
> >     File "/usr/lib/python3.6/tkinter/tix.py", line 311, in __init__
> >       self.tk.call(widgetName, self._w, *extra)
> >   _tkinter.TclError: invalid command name "tixFileSelectBox"
> > 
> > I realize that assigning the value of FileSelectBox() isn't going
> > to
> > give me a filename: I'm just trying to get the basic syntax right
> > at
> > the moment.
> > 
> > I can't figure out what is wrong though.  Have I misunderstood how
> > it should be called, or is there something missing from my system?
> > 
> 
> For some reason, tix widgets don't work with normal tkinter widgets,
> so 
> you can't put a tix FileSelectBox on a tkinter.Tk widget.
> 
> There is, however, a tix.Tk widget that you can use instead:
> 
> import tkinter.tix as tix
> root = tix.Tk()
> f = tix.FileSelectBox(root)
> f.pack()

Thanks for that.

When I ran the above, I got:

 Traceback (most recent call last):
   File "/home/david/bin/GradientProfile_v2.py", line 2, in <module>
     root = tix.Tk()
   File "/usr/lib/python3.6/tkinter/tix.py", line 214, in __init__
     self.tk.eval('package require Tix')
 _tkinter.TclError: can't find package Tix

After an internet search, I tried:

sudo apt install tix-dev tk-dev tk8.6-dev libxft-dev libfontconfig1-dev 
libfreetype6-dev libpng-dev

Now when I run the file the program just exits quickly, with no
reported errors, but no window(s).  If I add 'root.mainloop()' at the
end, I get an empty root window for a fraction of a second, then the
program exits with:

 Segmentation fault (core dumped)

Any suggestions as to where to go from here?

David



Re: A newbie question about using tix

2019-04-30 Thread MRAB

On 2019-04-30 16:40, David Sumbler wrote:

Running Ubuntu 18.04, Python 3.6.7, tkinter 8.6

I am very new to tkinter.  The simple program I am writing requires a
user file to be selected before it does anything else, so I would like
a file selection dialog in the main window as soon as the program
launches.

Tkinter only has askopenfilename(), but this produces a popup dialog.
I can get something like what I want by specifying a small Tk() window
and then calling askopenfilename() so that it covers the root window.

It's not ideal, though.  From the documentation I thought that the tix
FileSelectBox would do what I wanted, but I just can't get it to work.

If I run:

  from tkinter import *
  from tkinter.tix import FileSelectBox
  root = Tk()
  f = FileSelectBox(root)

I get the following:

  Traceback (most recent call last):
    File "/home/david/bin/Gradient.py", line 4, in <module>
      f = FileSelectBox(root)
    File "/usr/lib/python3.6/tkinter/tix.py", line 795, in __init__
      TixWidget.__init__(self, master, 'tixFileSelectBox', ['options'], cnf, kw)
    File "/usr/lib/python3.6/tkinter/tix.py", line 311, in __init__
      self.tk.call(widgetName, self._w, *extra)
  _tkinter.TclError: invalid command name "tixFileSelectBox"

I realize that assigning the value of FileSelectBox() isn't going to
give me a filename: I'm just trying to get the basic syntax right at
the moment.

I can't figure out what is wrong though.  Have I misunderstood how
it should be called, or is there something missing from my system?

For some reason, tix widgets don't work with normal tkinter widgets, so 
you can't put a tix FileSelectBox on a tkinter.Tk widget.


There is, however, a tix.Tk widget that you can use instead:

import tkinter.tix as tix
root = tix.Tk()
f = tix.FileSelectBox(root)
f.pack()
--
https://mail.python.org/mailman/listinfo/python-list


A newbie question about using tix

2019-04-30 Thread David Sumbler
Running Ubuntu 18.04, Python 3.6.7, tkinter 8.6

I am very new to tkinter.  The simple program I am writing requires a
user file to be selected before it does anything else, so I would like
a file selection dialog in the main window as soon as the program
launches.

Tkinter only has askopenfilename(), but this produces a popup dialog. 
I can get something like what I want by specifying a small Tk() window
and then calling askopenfilename() so that it covers the root window.

It's not ideal, though.  From the documentation I thought that the tix
FileSelectBox would do what I wanted, but I just can't get it to work.

If I run:

 from tkinter import *
 from tkinter.tix import FileSelectBox
 root = Tk()
 f = FileSelectBox(root)

I get the following:

 Traceback (most recent call last):
   File "/home/david/bin/Gradient.py", line 4, in <module>
     f = FileSelectBox(root)
   File "/usr/lib/python3.6/tkinter/tix.py", line 795, in __init__
     TixWidget.__init__(self, master, 'tixFileSelectBox', ['options'], cnf, kw)
   File "/usr/lib/python3.6/tkinter/tix.py", line 311, in __init__
     self.tk.call(widgetName, self._w, *extra)
 _tkinter.TclError: invalid command name "tixFileSelectBox"

I realize that assigning the value of FileSelectBox() isn't going to
give me a filename: I'm just trying to get the basic syntax right at
the moment.

I can't figure out what is wrong though.  Have I misunderstood how
it should be called, or is there something missing from my system?

David


-- 
https://mail.python.org/mailman/listinfo/python-list


Re: TKinter Newbie question

2019-01-18 Thread TUA
Thanks for your fresh pair of eyes!
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: TKinter Newbie question

2019-01-17 Thread Peter Otten
TUA wrote:

> Why does the button frame in the code below not show?

> tk.Button(bf, padx = 10, relief = tk.GROOVE, text = 'Help')

You forgot to lay it out with .pack().

> I intend to have it displayed in between the notebook at the top and the
> fake statusbar at the bottom.

I think for that you need to swap the sb.pack() and bf.pack() calls, or 
remove side=BOTTOM from bf.pack().


-- 
https://mail.python.org/mailman/listinfo/python-list


TKinter Newbie question

2019-01-17 Thread TUA
Why does the button frame in the code below not show?

I intend to have it displayed in between the notebook at the top and the fake 
statusbar at the bottom.

Thanks for any help!


from tkinter import ttk
import tkinter as tk

class MainForm():

    def __init__(self, master):

        self.master = master
        self.master.title('Test')

        nb = ttk.Notebook(self.master)

        page_1 = ttk.Frame(nb)

        nframe = ttk.LabelFrame(page_1, text='Frame', padding = 10)
        tk.Label(nframe, padx = 10, pady = 5, text = 'Name').pack()
        tk.Entry(nframe, width = 30).pack()
        tk.Label(nframe, padx = 10, pady = 5, text = 'City').pack()
        tk.Entry(nframe, width = 30).pack()
        #
        nframe.pack(fill="both", expand="yes", padx = 10, pady = 10)  # pad around the frame

        nb.add(page_1, text = 'Tab #1')

        nb.pack(expand = True, fill = "both")

        #---
        # button frame for Help button   why does it not show?
        #---
        bf = ttk.Frame(self.master, relief = tk.SUNKEN)
        tk.Button(bf, padx = 10, relief = tk.GROOVE, text = 'Help')
        bf.pack(side = tk.BOTTOM, fill = tk.X)

        #---
        # fake a status bar from a label
        #---
        sb = tk.Label(self.master, text = ' Waiting for rain ...', bd = 1,
                      anchor = tk.W, relief = tk.SUNKEN)
        sb.pack(side = tk.BOTTOM, fill = tk.X)

    def CloseApplication(self):
        self.master.destroy()

def StartApplication():
    root = tk.Tk()

    MainForm(root)
    root.mainloop()

if __name__ == '__main__':
    StartApplication()
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: RE newbie question

2018-04-18 Thread Steven D'Aprano
On Wed, 18 Apr 2018 12:37:29 -0700, TUA wrote:

> My intention is to implement a max. length of 8 for an input string. The
> above works well in all other respects, but does allow for strings that
> are too long.

if len(input_string) > 8:
    raise ValueError('string is too long')



-- 
Steve

-- 
https://mail.python.org/mailman/listinfo/python-list


Re: RE newbie question

2018-04-18 Thread Albert-Jan Roskam

On Apr 18, 2018 21:42, TUA  wrote:
>
> import re
>
> compval = 'A123456_8'
> regex = '[a-zA-Z]\w{0,7}'
>
> if re.match(regex, compval):
>print('Yes')
> else:
>print('No')
>
>
> My intention is to implement a max. length of 8 for an input string. The 
> above works well in all other respects, but does allow for strings that are 
> too long.
>
> What is the proper way to fix this?

Use a $ sign at the end of the regex
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: RE newbie question

2018-04-18 Thread TUA
Thanks much!
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: RE newbie question

2018-04-18 Thread Ian Kelly
On Wed, Apr 18, 2018 at 1:57 PM, Rob Gaddi
 wrote:
> On 04/18/2018 12:37 PM, TUA wrote:
>>
>> import re
>>
>> compval = 'A123456_8'
>> regex = '[a-zA-Z]\w{0,7}'
>>
>> if re.match(regex, compval):
>> print('Yes')
>> else:
>> print('No')
>>
>>
>> My intention is to implement a max. length of 8 for an input string. The
>> above works well in all other respects, but does allow for strings that are
>> too long.
>>
>> What is the proper way to fix this?
>>
>> Thanks for any help!
>>
>
> You could put the end marker $ on your regex so that it won't match if it's
> not the end.
>
> Or, you know, you could just check len(compval) <= 8 and not get bogged down
> in regexes.  They tend to be excellent solutions to only a very specific
> complexity of problem.

In Python 3.4+ you can also use re.fullmatch which requires a match
against the entire string.
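
A rough sketch of the options suggested in this thread, for comparison. The
pattern is the one from the original post; the helper name is_valid_id is
made up for illustration. Note that $ also matches just before a trailing
newline, so the sketch uses \Z where a strict end-of-string anchor is wanted.

```python
import re

# Pattern from the original post: one letter followed by up to 7 word characters.
PATTERN = r'[a-zA-Z]\w{0,7}'


def is_valid_id(text):
    """Return True only if the whole string fits the pattern (at most 8 characters)."""
    # re.fullmatch (Python 3.4+) requires the match to span the entire string.
    return re.fullmatch(PATTERN, text) is not None


compval = 'A123456_8'  # 9 characters, one too many

# re.match only anchors at the start, so it matches the first 8 characters.
print(bool(re.match(PATTERN, compval)))           # True
# Anchoring the end with \Z rejects the over-long string.
print(bool(re.match(PATTERN + r'\Z', compval)))   # False
print(is_valid_id(compval))                       # False
print(is_valid_id('A123456_'))                    # True
```

If the length limit is the only rule that matters, the plain len() check
mentioned elsewhere in the thread is simpler still.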
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: RE newbie question

2018-04-18 Thread Rob Gaddi

On 04/18/2018 12:37 PM, TUA wrote:

import re

compval = 'A123456_8'
regex = '[a-zA-Z]\w{0,7}'

if re.match(regex, compval):
    print('Yes')
else:
    print('No')


My intention is to implement a max. length of 8 for an input string. The above 
works well in all other respects, but does allow for strings that are too long.

What is the proper way to fix this?

Thanks for any help!



You could put the end marker $ on your regex so that it won't match if 
it's not the end.


Or, you know, you could just check len(compval) <= 8 and not get bogged 
down in regexes.  They tend to be excellent solutions to only a very 
specific complexity of problem.


--
Rob Gaddi, Highland Technology -- www.highlandtechnology.com
Email address domain is currently out of order.  See above to fix.
--
https://mail.python.org/mailman/listinfo/python-list


RE newbie question

2018-04-18 Thread TUA
import re

compval = 'A123456_8'
regex = '[a-zA-Z]\w{0,7}'

if re.match(regex, compval):
   print('Yes')
else:
   print('No')  


My intention is to implement a max. length of 8 for an input string. The above 
works well in all other respects, but does allow for strings that are too long.

What is the proper way to fix this?

Thanks for any help!
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: ARGPARSE Newbie question

2018-04-17 Thread TUA
Thanks for the pointers!
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: ARGPARSE Newbie question

2018-04-17 Thread paulclarke345
On Tuesday, April 17, 2018 at 7:09:45 PM UTC-5, TUA wrote:
> I'd like to create a script that handles a number of verbs with mandatory and 
> /or optional parameters like listed in the table below.
> 
> Can ARGPARSE do this and how?
> 
> Thanks for all help!
> 
> 
> 
> 
> 
> Script      Verb     Mandatory parameters                        Optional parameters
> ----------  -------  ------------------------------------------  ------------------------------------------------
> myprog.py   list     ---                                         verbose
> myprog.py   add      sid (string), type (string), memory (int)   comment (string), autostart (bool, default=TRUE)
> myprog.py   memory   sid (string), memory (integer)
> myprog.py   comment  sid (string), comment (string)
> myprog.py   restore  sid (string), srcpath (string)
> myprog.py   backup   sid (string), dstpath (string)
> myprog.py   remove   sid (string)

you can use subparsers for this. The syntax goes something like this:

import argparse

parser = argparse.ArgumentParser()
subparsers = parser.add_subparsers(dest='subparser_name')
list_parser = subparsers.add_parser("list", help="help for list")
list_parser.add_argument("-v", "--verbose", help="show verbose output",
                         action="store_true")
add_parser = subparsers.add_parser("add", help="help for add")
add_parser.add_argument("sid", type=str, help="help for sid")
...
etc. See the documentation on argparse for more on this.
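
For a fuller picture, here is a minimal sketch of how a few of the verbs from
the table could be wired up with subparsers. The flag spellings (--comment,
--no-autostart) and the sample arguments are assumptions made for
illustration; the original post only lists parameter names, not how they
should appear on the command line.

```python
import argparse


def build_parser():
    """Build a parser with one sub-command per verb (only three verbs shown)."""
    parser = argparse.ArgumentParser(prog='myprog.py')
    subparsers = parser.add_subparsers(dest='verb')
    subparsers.required = True  # a verb must always be given

    list_parser = subparsers.add_parser('list', help='list entries')
    list_parser.add_argument('-v', '--verbose', action='store_true')

    add_parser = subparsers.add_parser('add', help='add an entry')
    add_parser.add_argument('sid')                    # mandatory positional
    add_parser.add_argument('type')
    add_parser.add_argument('memory', type=int)
    add_parser.add_argument('--comment', default='')  # optional
    # Assumed flag name: autostart defaults to True, --no-autostart turns it off.
    add_parser.add_argument('--no-autostart', dest='autostart',
                            action='store_false')

    remove_parser = subparsers.add_parser('remove', help='remove an entry')
    remove_parser.add_argument('sid')
    return parser


# Parse a sample command line instead of sys.argv so the sketch runs anywhere.
args = build_parser().parse_args(
    ['add', 'DB1', 'oracle', '2048', '--comment', 'test box'])
print(args.verb, args.sid, args.memory, args.autostart)
```

The remaining verbs (memory, comment, restore, backup) would follow the same
pattern as 'remove', each with its own positional arguments.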
-- 
https://mail.python.org/mailman/listinfo/python-list


ARGPARSE Newbie question

2018-04-17 Thread TUA
I'd like to create a script that handles a number of verbs with mandatory and 
/or optional parameters like listed in the table below.

Can ARGPARSE do this and how?

Thanks for all help!





Script      Verb     Mandatory parameters                        Optional parameters
----------  -------  ------------------------------------------  ------------------------------------------------
myprog.py   list     ---                                         verbose
myprog.py   add      sid (string), type (string), memory (int)   comment (string), autostart (bool, default=TRUE)
myprog.py   memory   sid (string), memory (integer)
myprog.py   comment  sid (string), comment (string)
myprog.py   restore  sid (string), srcpath (string)
myprog.py   backup   sid (string), dstpath (string)
myprog.py   remove   sid (string)
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: newbie question re classes and self

2017-03-29 Thread Rick Johnson
On Tuesday, March 28, 2017 at 3:09:45 AM UTC-5, loial wrote:
> Can I pass self(or all its variables) to a class?
> Basically, how do I make all the variables defined in self
> in the calling python script available to the python class
> I want to call?

Your question, as presented, is difficult to understand, and
the phrase "variables defined in self", is quite absurd.

I'm making a wild assumption here, but perhaps you want to
"bulk-set" or "bulk-query" all the attributes of a class
instance externally? If so, depending on how the object was
defined, there are a few ways to achieve this.

However, my advanced powers of perception tell me that you
might be using Python in an incorrect manner, but i cannot
be sure until you explain this problem in more detail. So if
you can provide us a simple code example, or even pseudo
code, that would be very helpful.
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: newbie question re classes and self

2017-03-28 Thread Terry Reedy

On 3/28/2017 4:09 AM, loial wrote:

Can I pass self(or all its variables) to a class?


In Python, every argument to every function is an instance of some 
class.  The function can access any attribute of the arguments it 
receives with arg.attribute.
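
One way to read the question is "how does another class get at the attributes
of the object that creates it?". A minimal sketch under that assumption
follows; the class names Job and Report are hypothetical and not from the
original post.

```python
class Report:
    """Hypothetical helper class that needs data owned by another object."""

    def __init__(self, owner):
        # Keep a reference to the calling object; nothing is copied.
        self.owner = owner

    def summary(self):
        # Any attribute of the owner is reachable through the reference.
        return '{} has {} items'.format(self.owner.name, len(self.owner.items))


class Job:
    """Hypothetical caller that hands itself to Report."""

    def __init__(self, name):
        self.name = name
        self.items = ['a', 'b', 'c']

    def run(self):
        # Passing self gives Report access to every attribute set on this instance.
        return Report(self).summary()


print(Job('nightly').run())  # nightly has 3 items
```

If only a snapshot of the attribute values is needed, vars(self) returns the
instance's attribute dictionary, which can be passed along instead.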


--
Terry Jan Reedy

--
https://mail.python.org/mailman/listinfo/python-list


Re: newbie question re classes and self

2017-03-28 Thread Peter Otten
loial wrote:

> Can I pass self(or all its variables) to a class?
> 
> Basically, how do I make all the variables defined in self in the calling
> python script available to the python class I want to call?

Inside a method you can access attributes of an instance as self.whatever:

>>> class A:
...     def foo(self):
...         self.bar = 42
...     def baz(self):
...         print(self.bar)
... 
>>> a = A()
>>> a.baz() # bar attribute not yet set
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<stdin>", line 5, in baz
AttributeError: 'A' object has no attribute 'bar'
>>> a.foo() # sets bar
>>> a.baz() # successfully print the newly created bar attribute
42

The class itself has no access to bar. In the rare case where you want to 
share data between instances you can use a class attribute:

>>> class B:
...     bar = 42
... 
>>> x = B()
>>> y = B()
>>> x.bar
42
>>> y.bar
42

If neither is what you want please give a concrete example or a more 
detailed plain-english description.

-- 
https://mail.python.org/mailman/listinfo/python-list


newbie question re classes and self

2017-03-28 Thread loial
Can I pass self(or all its variables) to a class?

Basically, how do I make all the variables defined in self in the calling 
python script available to the python class I want to call?



-- 
https://mail.python.org/mailman/listinfo/python-list


Re: newbie question

2016-03-29 Thread Sven R. Kunze



On 28.03.2016 17:34, ast wrote:


"Matt Wheeler" wrote in message
news:mailman.92.1458825746.2244.python-l...@python.org...

On Thu, 24 Mar 2016 11:10 Sven R. Kunze,  wrote:


On 24.03.2016 11:57, Matt Wheeler wrote:
 import ast
 s = "(1, 2, 3, 4)"
 t = ast.literal_eval(s)
 t
> (1, 2, 3, 4)

I suppose that's the better solution in terms of safety.



It has the added advantage that the enquirer gets to import a module 
that

shares their name ;)



I had a look at that "ast" module doc, but I must admit that
I didn't understand a lot of things.


If there were a module "srkunze", I think, I would be equally surprised. ;)

Best,
Sven
--
https://mail.python.org/mailman/listinfo/python-list


Re: newbie question

2016-03-28 Thread ast


"Matt Wheeler" wrote in message
news:mailman.92.1458825746.2244.python-l...@python.org...

On Thu, 24 Mar 2016 11:10 Sven R. Kunze,  wrote:


On 24.03.2016 11:57, Matt Wheeler wrote:
 import ast
 s = "(1, 2, 3, 4)"
 t = ast.literal_eval(s)
 t
> (1, 2, 3, 4)

I suppose that's the better solution in terms of safety.



It has the added advantage that the enquirer gets to import a module that
shares their name ;)



I had a look at that "ast" module doc, but I must admit that
I didn't understand a lot of things.


--
https://mail.python.org/mailman/listinfo/python-list


Re: newbie question

2016-03-24 Thread Sven R. Kunze

On 24.03.2016 14:22, Matt Wheeler wrote:

On Thu, 24 Mar 2016 11:10 Sven R. Kunze,  wrote:


On 24.03.2016 11:57, Matt Wheeler wrote:

import ast
s = "(1, 2, 3, 4)"
t = ast.literal_eval(s)
t

(1, 2, 3, 4)

I suppose that's the better solution in terms of safety.


It has the added advantage that the enquirer gets to import a module that
shares their name ;)


One shouldn't underestimate this. ;-)
--
https://mail.python.org/mailman/listinfo/python-list


Re: newbie question

2016-03-24 Thread Grant Edwards
On 2016-03-24, Steven D'Aprano  wrote:
> On Thu, 24 Mar 2016 09:49 pm, David Palao wrote:
>
>> Hi,
>> Use "eval":
>> s = "(1, 2, 3, 4)"
>> t = eval(s)
>
> Don't use eval unless you absolutely, categorically, 100% trust the source
> of the string.

And then still don't use it. :)

eval is only safe if you're passing it a literal string containing
nothing but a literal constant expression -- in which case the eval is
superfluous.

OK, I admit I've used it for quick hacks on occasion.  But, I
shouldn't have.

-- 
Grant

-- 
https://mail.python.org/mailman/listinfo/python-list


Re: newbie question

2016-03-24 Thread Steven D'Aprano
On Thu, 24 Mar 2016 09:49 pm, David Palao wrote:

> Hi,
> Use "eval":
> s = "(1, 2, 3, 4)"
> t = eval(s)

Don't use eval unless you absolutely, categorically, 100% trust the source
of the string.

Otherwise, you are letting the person who provided the string run any code
they like on your computer. You want malware? That's how you get malware.
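
To make the difference concrete, here is a small sketch of the safer route
recommended elsewhere in this thread. The wrapper name parse_tuple and the
sample hostile string are made up for illustration.

```python
import ast


def parse_tuple(text):
    """Parse a string such as "(1, 2, 3, 4)" into a tuple without executing code."""
    value = ast.literal_eval(text)  # accepts only Python literals
    if not isinstance(value, tuple):
        raise ValueError('expected a tuple literal, got {}'.format(
            type(value).__name__))
    return value


print(parse_tuple("(1, 2, 3, 4)"))  # (1, 2, 3, 4)

# An expression that eval() would happily execute is rejected by literal_eval.
try:
    parse_tuple("__import__('os').getcwd()")
except ValueError as exc:
    print('rejected:', exc)
```

literal_eval only understands literals (numbers, strings, tuples, lists,
dicts, sets, booleans, None), so a function call in the input raises
ValueError instead of running.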




-- 
Steven

-- 
https://mail.python.org/mailman/listinfo/python-list


Re: newbie question

2016-03-24 Thread Matt Wheeler
On Thu, 24 Mar 2016 11:10 Sven R. Kunze,  wrote:

> On 24.03.2016 11:57, Matt Wheeler wrote:
>  import ast
>  s = "(1, 2, 3, 4)"
>  t = ast.literal_eval(s)
>  t
> > (1, 2, 3, 4)
>
> I suppose that's the better solution in terms of safety.
>

It has the added advantage that the enquirer gets to import a module that
shares their name ;)

>
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: newbie question

2016-03-24 Thread Steven D'Aprano
On Thu, 24 Mar 2016 09:39 pm, ast wrote:

> Hi
> 
> I have a string which contains a tuple, e.g.:
> 
> s = "(1, 2, 3, 4)"
> 
> and I want to recover the tuple in a variable t
> 
> t = (1, 2, 3, 4)
> 
> how would you do ?


py> import ast
py> ast.literal_eval("(1, 2, 3, 4)")
(1, 2, 3, 4)


-- 
Steven

-- 
https://mail.python.org/mailman/listinfo/python-list


Re: newbie question

2016-03-24 Thread Sven R. Kunze

On 24.03.2016 11:57, Matt Wheeler wrote:

import ast
s = "(1, 2, 3, 4)"
t = ast.literal_eval(s)
t

(1, 2, 3, 4)


I suppose that's the better solution in terms of safety.
--
https://mail.python.org/mailman/listinfo/python-list


Re: newbie question

2016-03-24 Thread Tim Chase
On 2016-03-24 11:49, David Palao wrote:
>> s = "(1, 2, 3, 4)"
>>
>> and I want to recover the tuple in a variable t
>>
>> t = (1, 2, 3, 4)
>>
>> how would you do ?
>
> Use "eval":
> s = "(1, 2, 3, 4)"
> t = eval(s)

Using eval() has security implications. Use ast.literal_eval for
safety instead:

  import ast
  s = "(1, 2, 3, 4)"
  t = ast.literal_eval(s)

-tkc


-- 
https://mail.python.org/mailman/listinfo/python-list


Re: newbie question

2016-03-24 Thread Matt Wheeler
>>> import ast
>>> s = "(1, 2, 3, 4)"
>>> t = ast.literal_eval(s)
>>> t
(1, 2, 3, 4)

On 24 March 2016 at 10:39, ast  wrote:
> Hi
>
> I have a string which contains a tuple, e.g.:
>
> s = "(1, 2, 3, 4)"
>
> and I want to recover the tuple in a variable t
>
> t = (1, 2, 3, 4)
>
> how would you do ?
>
>
> --
> https://mail.python.org/mailman/listinfo/python-list



-- 
Matt Wheeler
http://funkyh.at
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: newbie question

2016-03-24 Thread ast
"David Palao" wrote in message
news:mailman.86.1458816553.2244.python-l...@python.org...

Hi,
Use "eval":
s = "(1, 2, 3, 4)"
t = eval(s)

Best




Thank you 


--
https://mail.python.org/mailman/listinfo/python-list


Re: newbie question

2016-03-24 Thread David Palao
Hi,
Use "eval":
s = "(1, 2, 3, 4)"
t = eval(s)

Best

2016-03-24 11:39 GMT+01:00 ast :
> Hi
>
> I have a string which contains a tuple, e.g.:
>
> s = "(1, 2, 3, 4)"
>
> and I want to recover the tuple in a variable t
>
> t = (1, 2, 3, 4)
>
> how would you do ?
>
>
> --
> https://mail.python.org/mailman/listinfo/python-list
-- 
https://mail.python.org/mailman/listinfo/python-list


newbie question

2016-03-24 Thread ast

Hi

I have a string which contains a tuple, e.g.:

s = "(1, 2, 3, 4)"

and I want to recover the tuple in a variable t

t = (1, 2, 3, 4)

How would you do it?


--
https://mail.python.org/mailman/listinfo/python-list


Re: tkinter newbie question

2016-01-25 Thread KP
On Sunday, 24 January 2016 20:20:07 UTC-8, KP  wrote:
> See my code below (which works). I'd like to have the 2nd window as a class 
> in a separate unit. How do I code that unit and how do I call it from my 
> first unit?
> 
> As always, thanks for all help!
> 
> 
> 
> 
> #!/usr/bin/env python
> """
> """ 
> from tkinter import *
> from settings import *
> 
> class window1():
> 
> def open_window2(self):
> t = Toplevel(self.root)
> t.title('New window')
> t.geometry('262x65+200+250')
> t.transient(self.root)
> 
> def setup_menu(self):
> self.menubar = Menu(self.root)
> self.menu1 = Menu(self.menubar, tearoff=0 ) 
> self.menu1.add_command(label="Settings",   accelerator='Ctrl+S', 
> command=self.open_window2)
> self.menubar.add_cascade(label="Menu 1", menu=self.menu1)  
> self.root.config(menu=self.menubar)
> 
> def __init__(self):
> self.root = Tk()
> self.root.title('Window #1')
> self.setup_menu()
> self.root.geometry('800x600+200+200')
> #
> self.root.mainloop()
> 
> if __name__ == '__main__':
> 
> w1 = window1()

Thank you - much appreciated!
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: tkinter newbie question

2016-01-25 Thread KP
On Monday, 25 January 2016 08:22:12 UTC-8, KP  wrote:
> On Monday, 25 January 2016 00:51:34 UTC-8, Peter Otten  wrote:
> > KP wrote:
> > 
> > > See my code below (which works). 
> > 
> > >From the import of lowercase "tkinter" I conclude you are using Python 3.
> > 
> > > I'd like to have the 2nd window as a
> > > class in a separate unit. How do I code that unit and how do I call it
> > > from my first unit?
> > > 
> > > As always, thanks for all help!
> > 
> > Move the code from open_window2() into a class in settings.py, e. g.
> > 
> > 
> > import tkinter as tk # avoid star-import
> > 
> > class SettingsWindow(tk.Toplevel): # Class names start with uppercase letter
> ># Prefer self-explaining names
> > def __init__(self, root):
> > super().__init__(root)
> > self.title('New window')
> > self.geometry('262x65+200+250')
> > self.transient(root)
> > 
> > Then use it in your main script:
> > 
> > 
> > > #!/usr/bin/env python
> > > """
> > > """
> > > from tkinter import *
> > import settings
> > 
> > > class window1():
> > > 
> > > def open_window2(self):
> >   settings.SettingsWindow(self.root)
> > 
> > > def setup_menu(self):
> > > self.menubar = Menu(self.root)
> > > self.menu1 = Menu(self.menubar, tearoff=0 )
> > > self.menu1.add_command(label="Settings",   accelerator='Ctrl+S',
> > > command=self.open_window2) self.menubar.add_cascade(label="Menu
> > > 1", menu=self.menu1) self.root.config(menu=self.menubar)
> > > 
> > > def __init__(self):
> > > self.root = Tk()
> > > self.root.title('Window #1')
> > > self.setup_menu()
> > > self.root.geometry('800x600+200+200')
> > > #
> > > self.root.mainloop()
> > > 
> > > if __name__ == '__main__':
> > > 
> > > w1 = window1()
> 
> Dang - almost there. Using your code, I get the new window with the specified 
> geometry and its type is transient, as expected.
> 
> Its caption, however, is NOT the caption specified, but the caption of the 
> first window, leaving me with 2 windows with identical caption.
> 
> Any idea why?

Forget that post - mea culpa - figured it out - sorry!
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: tkinter newbie question

2016-01-25 Thread KP
On Monday, 25 January 2016 00:51:34 UTC-8, Peter Otten  wrote:
> KP wrote:
> 
> > See my code below (which works). 
> 
> >From the import of lowercase "tkinter" I conclude you are using Python 3.
> 
> > I'd like to have the 2nd window as a
> > class in a separate unit. How do I code that unit and how do I call it
> > from my first unit?
> > 
> > As always, thanks for all help!
> 
> Move the code from open_window2() into a class in settings.py, e. g.
> 
> 
> import tkinter as tk # avoid star-import
> 
> class SettingsWindow(tk.Toplevel): # Class names start with uppercase letter
># Prefer self-explaining names
> def __init__(self, root):
> super().__init__(root)
> self.title('New window')
> self.geometry('262x65+200+250')
> self.transient(root)
> 
> Then use it in your main script:
> 
> 
> > #!/usr/bin/env python
> > """
> > """
> > from tkinter import *
> import settings
> 
> > class window1():
> > 
> > def open_window2(self):
>   settings.SettingsWindow(self.root)
> 
> > def setup_menu(self):
> > self.menubar = Menu(self.root)
> > self.menu1 = Menu(self.menubar, tearoff=0 )
> > self.menu1.add_command(label="Settings",   accelerator='Ctrl+S',
> > command=self.open_window2) self.menubar.add_cascade(label="Menu
> > 1", menu=self.menu1) self.root.config(menu=self.menubar)
> > 
> > def __init__(self):
> > self.root = Tk()
> > self.root.title('Window #1')
> > self.setup_menu()
> > self.root.geometry('800x600+200+200')
> > #
> > self.root.mainloop()
> > 
> > if __name__ == '__main__':
> > 
> > w1 = window1()

Dang - almost there. Using your code, I get the new window with the specified 
geometry and its type is transient, as expected.

Its caption, however, is NOT the caption specified, but the caption of the 
first window, leaving me with 2 windows with identical caption.

Any idea why?
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: tkinter newbie question

2016-01-25 Thread Peter Otten
KP wrote:

> See my code below (which works). 

From the import of lowercase "tkinter" I conclude you are using Python 3.

> I'd like to have the 2nd window as a
> class in a separate unit. How do I code that unit and how do I call it
> from my first unit?
> 
> As always, thanks for all help!

Move the code from open_window2() into a class in settings.py, e. g.


import tkinter as tk # avoid star-import

class SettingsWindow(tk.Toplevel): # Class names start with uppercase letter
                                   # Prefer self-explaining names
    def __init__(self, root):
        super().__init__(root)
        self.title('New window')
        self.geometry('262x65+200+250')
        self.transient(root)

Then use it in your main script:


> #!/usr/bin/env python
> """
> """
> from tkinter import *
import settings

> class window1():
> 
>     def open_window2(self):
          settings.SettingsWindow(self.root)

>     def setup_menu(self):
>         self.menubar = Menu(self.root)
>         self.menu1 = Menu(self.menubar, tearoff=0 )
>         self.menu1.add_command(label="Settings",   accelerator='Ctrl+S',
>                                command=self.open_window2)
>         self.menubar.add_cascade(label="Menu 1", menu=self.menu1)
>         self.root.config(menu=self.menubar)
> 
>     def __init__(self):
>         self.root = Tk()
>         self.root.title('Window #1')
>         self.setup_menu()
>         self.root.geometry('800x600+200+200')
>         #
>         self.root.mainloop()
> 
> if __name__ == '__main__':
> 
>     w1 = window1()


-- 
https://mail.python.org/mailman/listinfo/python-list


tkinter newbie question

2016-01-24 Thread KP
See my code below (which works). I'd like to have the 2nd window as a class in 
a separate unit. How do I code that unit and how do I call it from my first 
unit?

As always, thanks for all help!




#!/usr/bin/env python
"""
"""
from tkinter import *
from settings import *

class window1():

    def open_window2(self):
        t = Toplevel(self.root)
        t.title('New window')
        t.geometry('262x65+200+250')
        t.transient(self.root)

    def setup_menu(self):
        self.menubar = Menu(self.root)
        self.menu1 = Menu(self.menubar, tearoff=0 )
        self.menu1.add_command(label="Settings",   accelerator='Ctrl+S',
                               command=self.open_window2)
        self.menubar.add_cascade(label="Menu 1", menu=self.menu1)
        self.root.config(menu=self.menubar)

    def __init__(self):
        self.root = Tk()
        self.root.title('Window #1')
        self.setup_menu()
        self.root.geometry('800x600+200+200')
        #
        self.root.mainloop()

if __name__ == '__main__':

    w1 = window1()
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Newbie question about text encoding

2015-03-09 Thread Rustom Mody
On Monday, March 9, 2015 at 12:05:05 PM UTC+5:30, Steven D'Aprano wrote:
 Chris Angelico wrote:
 
  As to the notion of rejecting the construction of strings containing
  these invalid codepoints, I'm not sure. Are there any languages out
  there that have a Unicode string type that requires that all
  codepoints be valid (no surrogates, no U+FFFE, etc)?
 
 U+FFFE and U+FFFF are *noncharacters*, not invalid. There are a total of 66
 noncharacters in Unicode, and they are legal in strings.

Interesting -- Thanks!
I wonder whether that's one more instance of the anti-pattern (other thread)?
Number that's not a number -- NaN
Pointer that points nowhere -- NULL
SQL data that's not there but there -- null

 
 http://www.unicode.org/faq/private_use.html#nonchar8
 
 I think the only illegal code points are surrogates. Surrogates should only
 appear as bytes in UTF-16 byte-strings.

Even more interesting: So there's a whole hierarchy of illegality??
Could you suggest some good reference for 'surrogate'?
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Newbie question about text encoding

2015-03-09 Thread Marko Rauhamaa
Ben Finney ben+pyt...@benfinney.id.au:

 Steven D'Aprano steve+comp.lang.pyt...@pearwood.info writes:

 '\udd00' should be a SyntaxError.

 I find your argument convincing, that attempting to construct a
 Unicode string of a lone surrogate should be an error.

Then we're back to square one:

   >>> b'\x80'.decode('utf-8', errors='surrogateescape')
   '\udc80'


Marko
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Newbie question about text encoding

2015-03-09 Thread Chris Angelico
On Mon, Mar 9, 2015 at 5:34 PM, Steven D'Aprano
steve+comp.lang.pyt...@pearwood.info wrote:
 Chris Angelico wrote:

 As to the notion of rejecting the construction of strings containing
 these invalid codepoints, I'm not sure. Are there any languages out
 there that have a Unicode string type that requires that all
 codepoints be valid (no surrogates, no U+FFFE, etc)?

 U+FFFE and U+FFFF are *noncharacters*, not invalid. There are a total of 66
 noncharacters in Unicode, and they are legal in strings.

 http://www.unicode.org/faq/private_use.html#nonchar8

 I think the only illegal code points are surrogates. Surrogates should only
 appear as bytes in UTF-16 byte-strings.

U+FFFE would cause problems at the beginning of a UTF-16 stream, as it
could be mistaken for a BOM - that's why it's a noncharacter. But
sure, let's leave them out of the discussion. The question is whether
surrogates are legal or not.

ChrisA
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Newbie question about text encoding

2015-03-09 Thread Steven D'Aprano
Chris Angelico wrote:

 As to the notion of rejecting the construction of strings containing
 these invalid codepoints, I'm not sure. Are there any languages out
 there that have a Unicode string type that requires that all
 codepoints be valid (no surrogates, no U+FFFE, etc)?

U+FFFE and U+FFFF are *noncharacters*, not invalid. There are a total of 66
noncharacters in Unicode, and they are legal in strings.

http://www.unicode.org/faq/private_use.html#nonchar8

I think the only illegal code points are surrogates. Surrogates should only
appear as bytes in UTF-16 byte-strings.



-- 
Steven

-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Newbie question about text encoding

2015-03-08 Thread Chris Angelico
On Mon, Mar 9, 2015 at 5:25 AM, Steven D'Aprano
steve+comp.lang.pyt...@pearwood.info wrote:
 Marko Rauhamaa wrote:

 Chris Angelico ros...@gmail.com:

 Once again, you appear to be surprised that invalid data is failing.
 Why is this so strange? U+DD00 is not a valid character.

 But it is a valid non-character code point.

 It is quite correct to throw this error.

 '\udd00' is a valid str object:

 Is it though? Perhaps the bug is not UTF-8's inability to encode lone
 surrogates, but that Python allows you to create lone surrogates in the
 first place. That's not a rhetorical question. It's a genuine question.

Ah, I see the confusion. Yes, it is plausible to permit the UTF-8-like
encoding of surrogates; but it's illegal according to the RFC:

https://tools.ietf.org/html/rfc3629

   The definition of UTF-8 prohibits encoding character numbers between
   U+D800 and U+DFFF, which are reserved for use with the UTF-16
   encoding form (as surrogate pairs) and do not directly represent
   characters.


They're not valid characters, and the UTF-8 spec explicitly says that
they must not be encoded. Python is fully spec-compliant in rejecting
these. Some encoders [1] will permit them, but the resulting stream is
invalid UTF-8, just as CESU-8 and Modified UTF-8 are (the latter being
UTF-8, only U+0000 is represented as C0 80).
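
To illustrate why such streams are not valid UTF-8, here is a small sketch.
The helper to_modified_utf8_nul is hypothetical and models only the NUL rule
of Modified UTF-8 (U+0000 written as C0 80); it assumes CPython 3's strict
decoder, which rejects that overlong form.

```python
def to_modified_utf8_nul(text: str) -> bytes:
    """Hypothetical helper: UTF-8, except that U+0000 is written as C0 80.

    Only the NUL rule is sketched; the way Modified UTF-8 handles
    characters outside the BMP is deliberately left out.
    """
    # In real UTF-8 the byte 0x00 can only ever stand for U+0000,
    # so a plain byte replacement is enough here.
    return text.encode("utf-8").replace(b"\x00", b"\xc0\x80")


smuggled = to_modified_utf8_nul("a\x00b")

try:
    smuggled.decode("utf-8")
    accepted = True
except UnicodeDecodeError:
    accepted = False

print(smuggled, accepted)
```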

ChrisA

[1] eg 
http://pike.lysator.liu.se/generated/manual/modref/ex/predef_3A_3A/string_to_utf8.html
optionally


Re: Newbie question about text encoding

2015-03-08 Thread Chris Angelico
On Mon, Mar 9, 2015 at 5:25 AM, Steven D'Aprano
steve+comp.lang.pyt...@pearwood.info wrote:
 Perhaps the bug is not UTF-8's inability to encode lone
 surrogates, but that Python allows you to create lone surrogates in the
 first place. That's not a rhetorical question. It's a genuine question.

As to the notion of rejecting the construction of strings containing
these invalid codepoints, I'm not sure. Are there any languages out
there that have a Unicode string type that requires that all
codepoints be valid (no surrogates, no U+FFFE, etc)? This is the kind
of thing that's usually done in an obscure language before it hits a
mainstream one.

Pike is similar to Python here. I can create a string with invalid
code points in it:

> "\uFFFE\uDD00";
(1) Result: "\ufffe\udd00"

but I can't UTF-8 encode that:

> string_to_utf8("\uFFFE\uDD00");
Character 0xdd00 at index 1 is in the surrogate range and therefore invalid.
Unknown program: string_to_utf8("\ufffe\udd00")
HilfeInput:1: HilfeInput()->___HilfeWrapper()

Or, using the streaming UTF-8 encoder instead of the short-hand:

> Charset.encoder("UTF-8")->feed("\uFFFE\uDD00")->drain();
Error encoding "\ufffe"[0xdd00] using utf8: Unsupported character 56576.
/usr/local/pike/8.1.0/lib/modules/_Charset.so:1:
_Charset.UTF8enc()->feed("\ufffe\udd00")
HilfeInput:1: HilfeInput()->___HilfeWrapper()

Does anyone know of a language where you can't even construct the string?

ChrisA


Re: Newbie question about text encoding

2015-03-08 Thread Steven D'Aprano
Rustom Mody wrote:

 On Saturday, March 7, 2015 at 4:39:48 PM UTC+5:30, Steven D'Aprano wrote:
 Rustom Mody wrote:
  This includes not just bug-prone-system code such as Java and Windows
  but seemingly working code such as python 3.
 
 What Unicode bugs do you think Python 3.3 and above have?
 
 Literal/Legalistic answer:
 https://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2012-2135

Nice one :-) but not exactly in the spirit of what we're discussing (as you
acknowledge below), so I won't discuss that.


 [And already quoted at
 http://blog.languager.org/2015/03/whimsical-unicode.html
 ]
 
 An answer more in the spirit of what I am trying to say:
 Idle3, Roy's example and in general all systems that are
 python-centric but use components outside of python that are
 unicode-broken
 
 IOW I would expect people (at least people with good faith) reading my
 
 bug-prone-system code...seemingly working code such as python 3...
 
 to interpret that NOT as
 
 python 3 is seemingly working but actually broken


Why not? That is the natural interpretation of the sentence, particularly in
the context of your previous sentence:

[quote]
Or you can skip the blame-game and simply note the fact that 
large segments of extant code-bases are currently in bug-prone
or plain buggy state.

This includes not just bug-prone-system code such as Java and
Windows but seemingly working code such as python 3.
[end quote]


The natural interpretation of this is that Python 3 is only *seemingly*
working, but is also an example of a code base in bug-prone or plain buggy
state.

If that's not your intended meaning, then rather than casting aspersions on
my honesty (good faith indeed) you might accept that perhaps you didn't
quite manage to get your message across.


 But as
 
 Apps made with working system code (eg python3) can end up being broken
 because of other non-working system code - eg mysql, java, javascript,
 windows-shell, and ultimately windows, linux

Don't forget viruses or other malware, cosmic rays, processor bugs, dry
solder joints on the motherboard, faulty memory, and user-error.

I'm not sure what point you think you are making. If you want to discuss the
fact that complex systems have more interactions than simple systems, and
therefore more ways for things to go wrong, I will agree. I'll agree that
this is an issue with Python code that interacts with other systems which
may or may not implement Unicode correctly. There are a few ways to
interpret this:

(1) You're making a general point about the complexity of modern computing.

(2) You're making the point that dealing with text encodings in general, and
Unicode in specific, is hard because of the interaction of programming
language, database, file system, locale, etc.

(3) You're implying that Python ought to fix this problem some how.

(4) You're implying that *Unicode* specifically is uniquely problematic in
this way. Or at least *unusual* to be problematic in this way.


I will agree with 1 and 2; I'll say that 3 would be nice but in the absence
of concrete proposals for how to fix it, it's just meaningless chatter. And
I'll disagree strongly with 4.

Unicode came into existence because legacy encodings suffer from similar
problems, only worse. (One major advantage of Unicode over previous
multi-byte encodings is that the UTF encodings are self-healing. A single
corrupted byte will, *at worst*, cause a single corrupted code point.)
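
The self-healing property can be sketched as follows, assuming CPython 3's
UTF-8 decoder with errors='replace' (the sample text and the damaged position
are made up for illustration). One byte is overwritten, and everything after
the damaged character still decodes unchanged.

```python
original = "h\u00e9llo w\u00f6rld"          # "héllo wörld"
damaged = bytearray(original.encode("utf-8"))

# Overwrite the lead byte of 'é' (C3 A9) with 0xFF, which never occurs in UTF-8.
damaged[1] = 0xFF

recovered = bytes(damaged).decode("utf-8", errors="replace")

# Only the character whose bytes were hit is lost; the decoder
# resynchronises at the next lead byte and the tail is intact.
print(recovered)
```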

In one sense, Unicode has solved these legacy encoding problems, in the
sense that if you always use a correct implementation of Unicode then you
won't *ever* suffer from problems like moji-bake, broken strings and so
forth.

In another sense, Unicode hasn't solved these legacy problems because we
still have to deal with files using legacy encodings, as well as standards
organisations, operating systems, developers, applications and users who
continue to produce new content using legacy encodings, buggy or incorrect
implementations of the standard, also viruses, cosmic rays, dry solder
joints and user-error. How are these things Unicode's fault or
responsibility?



-- 
Steven



Re: Newbie question about text encoding

2015-03-08 Thread Steven D'Aprano
Marko Rauhamaa wrote:

 Chris Angelico ros...@gmail.com:
 
 Once again, you appear to be surprised that invalid data is failing.
 Why is this so strange? U+DD00 is not a valid character. 

But it is a valid non-character code point.


 It is quite correct to throw this error.
 
 '\udd00' is a valid str object:

Is it though? Perhaps the bug is not UTF-8's inability to encode lone
surrogates, but that Python allows you to create lone surrogates in the
first place. That's not a rhetorical question. It's a genuine question.


 >>> '\udd00'
 '\udd00'
 >>> '\udd00'.encode('utf-32')
 b'\xff\xfe\x00\x00\x00\xdd\x00\x00'
 >>> '\udd00'.encode('utf-16')
 b'\xff\xfe\x00\xdd'

If you explicitly specify the endianness (say, utf-16-be or -le) then you
don't get the BOMs.
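
A short sketch of the BOM behaviour, using a non-surrogate character. It also
assumes Python 3.4 or later, where (as far as I know) the utf-16 and utf-32
encoders reject lone surrogates by default, so the session above needs
errors='surrogatepass' to reproduce on current versions.

```python
import codecs

text = "A"
with_bom = text.encode("utf-16")      # native byte order, BOM comes first
little = text.encode("utf-16-le")     # explicit endianness, no BOM
big = text.encode("utf-16-be")        # explicit endianness, no BOM

# Encoding a lone surrogate needs an explicit opt-in on Python 3.4+.
lone = "\udd00".encode("utf-16-le", errors="surrogatepass")

print(with_bom, little, big, lone)
```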

 I was simply stating that UTF-8 is not a bijection between unicode
 strings and octet strings (even forgetting Python). Enriching Unicode
 with 128 surrogates (U+DC80..U+DCFF) establishes a bijection, but not
 without side effects.



-- 
Steven



Re: Newbie question about text encoding

2015-03-08 Thread Marko Rauhamaa
Steven D'Aprano steve+comp.lang.pyt...@pearwood.info:

 Marko Rauhamaa wrote:
 '\udd00' is a valid str object:

 Is it though? Perhaps the bug is not UTF-8's inability to encode lone
 surrogates, but that Python allows you to create lone surrogates in
 the first place. That's not a rhetorical question. It's a genuine
 question.

The problem is that no matter how you shuffle surrogates, encoding
schemes, coding points and the like, a wrinkle always remains.

I'm reminded of number sets where you go from ℕ to ℤ to ℚ to ℝ to ℂ. But
that's where the buck stops; traditional arithmetic functions are closed
under ℂ.

Unicode apparently hasn't found a similar closure.

That's why I think that while UTF-8 is a fabulous way to bring Unicode
to Linux, Linux should have taken the tack that Unicode is always an
application-level interpretation with few operating system tie-ins.
Unfortunately, the GNU world is busy trying to build a Unicode frosting
everywhere. The illusion can never be complete but is convincing enough
for application developers to forget to handle corner cases.

To answer your question, I think every code point from 0 to 1114111
should be treated as valid and analogous. Thus Python is correct here:

   >>> len('\udd00')
   1
   >>> len('\ufeff')
   1

The alternatives are far too messy to consider.
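
For reference, the range mentioned here can be checked from Python itself.
This sketch assumes Python 3.3 or later, where sys.maxunicode is always
0x10FFFF; the variable names are illustrative.

```python
import sys

highest = sys.maxunicode              # 1114111, i.e. 0x10FFFF

# Every value in range is accepted by chr(), surrogates included.
lengths = [len(chr(cp)) for cp in (0, 0xDD00, 0xFEFF, highest)]

# One past the end is rejected at run time.
try:
    chr(highest + 1)
    beyond_allowed = True
except ValueError:
    beyond_allowed = False

print(highest, lengths, beyond_allowed)
```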


Marko


Re: Newbie question about text encoding

2015-03-08 Thread Steven D'Aprano
Steven D'Aprano wrote:

 Marko Rauhamaa wrote:
 
 Steven D'Aprano steve+comp.lang.pyt...@pearwood.info:
 
 Marko Rauhamaa wrote:

 That said, UTF-8 does suffer badly from its not being
 a bijective mapping.

 Can you explain?
 
 In Python terms, there are bytes objects b that don't satisfy:
 
b.decode('utf-8').encode('utf-8') == b
 
 Are you talking about the fact that not all byte streams are valid UTF-8?
 That is, some byte objects b may raise an exception on b.decode('utf-8').

Eh, I should have read the rest of the thread before replying...


 I don't see why that means UTF-8 suffers badly from this. Can you give
 an example of where you would expect to take an arbitrary byte-stream,
 decode it as UTF-8, and expect the results to be meaningful?

File names on Unix-like systems.

Unfortunately file names are a bit of a mess, but we're slowly converging on
Unicode support for files. I reckon that by 2070, 2080 tops, we'll have
that licked...

The three major operating systems have different levels of support for
Unicode file names:

* Apple OS X: HFS+ stores file names in decomposed form, using UTF-16. I
think this is the strictest Unicode support of all common file systems.
Well done Apple. Decomposed in this sense means that single code points may
be expanded where possible, e.g. é U+00E9 LATIN SMALL LETTER E WITH ACUTE
will be stored as two code points, U+0065 LATIN SMALL LETTER E + U+0301
COMBINING ACUTE ACCENT.

* Windows: NTFS stores file names as sequences of 16-bit code units except
0x0000. (Additional restrictions also apply: e.g. in POSIX mode, / is also
forbidden; in Win32 mode, / ? + etc. are forbidden.) The code units are
interpreted as UTF-16 but the file system doesn't prevent you from creating
file names with invalid sequences.

* Linux: ext2/ext3 stores file names as arbitrary bytes except for / and
nul. However most Linux distributions treat file names as if they were
UTF-8 (displaying ? glyphs for undecodable bytes), and many Linux GUI file
managers enforce the rule that file names are valid UTF-8.
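
The decomposition described for HFS+ can be approximated with the standard
library. This is only a sketch: it uses standard NFD normalisation, whereas
HFS+ is usually described as using its own variant of canonical decomposition,
so treat the equivalence as an assumption.

```python
import unicodedata

composed = "\u00e9"                                   # é as one code point
decomposed = unicodedata.normalize("NFD", composed)   # e + combining acute

code_points = [f"U+{ord(ch):04X}" for ch in decomposed]
recomposes = unicodedata.normalize("NFC", decomposed) == composed

# The two spellings look identical but compare unequal as strings,
# which is why file names may need normalising before comparison.
print(code_points, composed == decomposed, recomposes)
```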

File systems on removable media (FAT32, UDF, ISO-9660 with or without
extensions such as Joliet and Rock Ridge) have their own issues, but
generally speaking don't support Unicode well or at all.

So although the current situation is still a bit of a mess, there is a slow
move towards file names which are valid Unicode.


-- 
Steven



Re: Newbie question about text encoding

2015-03-08 Thread Marko Rauhamaa
Chris Angelico ros...@gmail.com:

 Once again, you appear to be surprised that invalid data is failing.
 Why is this so strange? U+DD00 is not a valid character. It is quite
 correct to throw this error.

'\udd00' is a valid str object:

   >>> '\udd00'
   '\udd00'
   >>> '\udd00'.encode('utf-32')
   b'\xff\xfe\x00\x00\x00\xdd\x00\x00'
   >>> '\udd00'.encode('utf-16')
   b'\xff\xfe\x00\xdd'

I was simply stating that UTF-8 is not a bijection between unicode
strings and octet strings (even forgetting Python). Enriching Unicode
with 128 surrogates (U+DC80..U+DCFF) establishes a bijection, but not
without side effects.


Marko


Re: Newbie question about text encoding

2015-03-08 Thread Chris Angelico
On Sun, Mar 8, 2015 at 7:09 PM, Marko Rauhamaa ma...@pacujo.net wrote:
 Chris Angelico ros...@gmail.com:

 Once again, you appear to be surprised that invalid data is failing.
 Why is this so strange? U+DD00 is not a valid character. It is quite
 correct to throw this error.

 '\udd00' is a valid str object:

 >>> '\udd00'
 '\udd00'
 >>> '\udd00'.encode('utf-32')
 b'\xff\xfe\x00\x00\x00\xdd\x00\x00'
 >>> '\udd00'.encode('utf-16')
 b'\xff\xfe\x00\xdd'

 I was simply stating that UTF-8 is not a bijection between unicode
 strings and octet strings (even forgetting Python). Enriching Unicode
 with 128 surrogates (U+DC80..U+DCFF) establishes a bijection, but not
 without side effects.

But it's not a valid Unicode string, so a Unicode encoding can't be
expected to cope with it. Mathematically, 0xC0 0x80 would represent
U+0000, and some UTF-8 codecs generate and accept this (in order to
allow U+0000 without ever yielding 0x00), but that doesn't mean that
UTF-8 should allow that byte sequence.

The only reason to craft some kind of Unicode string for any arbitrary
sequence of bytes is the smuggling effect used for file name
handling. There is no reason to support invalid Unicode codepoints.

ChrisA


Re: Newbie question about text encoding

2015-03-08 Thread Rustom Mody
On Monday, March 9, 2015 at 7:39:42 AM UTC+5:30, Cameron Simpson wrote:
 On 07Mar2015 22:09, Steven D'Aprano  wrote:
 Rustom Mody wrote:
 [...big snip...]
  Some parts are here some earlier and from my memory.
  If details wrong please correct:
  - 200 million records
  - Containing 4 strings with SMP characters
  - System made with python and mysql. SMP works with python, breaks mysql.
So whole system broke due to those 4 in 200,000,000 records
 
 No, they broke because MySQL has buggy Unicode handling.
 [...]
  You could also choose do with astral crap (Roy's words) what we all do
  with crap -- throw it out as early as possible.
 
 And when Roy's customers demand that his product support emoji, or complain
 that they cannot spell their own name because of his parochial and ignorant
 idea of crap, perhaps he will consider doing what he should have done
 from the beginning:
 
 Stop using MySQL, which is a joke of a database[1], and use Postgres which
 does not have this problem.
 
 [1] So I have been told.
 
 I use MySQL a fair bit, and Postgres very slightly. I would agree with your 
 characterisation above; MySQL is littered with inconsistencies and arbitrary 
 breakage, both in tools and SQL implementation. And Postgres has been a pure 
 pleasure to work with, little though I have done that so far.
 
 Cheers,
 Cameron Simpson
 
 There is no human problem which could not be solved if people would simply
 do as I advise. - Gore Vidal

I think that last quote sums up the issue best.
I've written to Intel asking them to make their next generation have 21-bit
wide bytes. Once they do that we will be back in the paradise we have been
in for the last 40 years, which I call the 'Unix-assumption':
http://blog.languager.org/2014/04/unicode-and-unix-assumption.html

Until then...

We have to continue living in the real world.
Which includes 10 times more windows than linux users.
Is windows 10 times better an OS than linux?

In the 'real world' people make choices for all sorts of reasons. My guess is
that the top reason is the pointiness of the hair of the pointy-haired boss.

Just like people choose Windows over Linux, people choose MySQL over Postgres,
and that's the context of this discussion -- people stuck in sub-optimal choices


Re: Newbie question about text encoding

2015-03-08 Thread Ben Finney
Steven D'Aprano steve+comp.lang.pyt...@pearwood.info writes:

 '\udd00' should be a SyntaxError.

I find your argument convincing, that attempting to construct a Unicode
string of a lone surrogate should be an error.

Shouldn't the error type be a ValueError, though? The statement is not,
to my mind, erroneous syntax.

-- 
 \ “Please do not feed the animals. If you have any suitable food, |
  `\ give it to the guard on duty.” —zoo, Budapest |
_o__)  |
Ben Finney



Re: Newbie question about text encoding

2015-03-08 Thread Cameron Simpson

On 07Mar2015 22:09, Steven D'Aprano steve+comp.lang.pyt...@pearwood.info 
wrote:

Rustom Mody wrote:

[...big snip...]
Some parts are here some earlier and from my memory.
If details wrong please correct:
- 200 million records
- Containing 4 strings with SMP characters
- System made with python and mysql. SMP works with python, breaks mysql.
  So whole system broke due to those 4 in 200,000,000 records


No, they broke because MySQL has buggy Unicode handling.

[...]

You could also choose do with astral crap (Roy's words) what we all do
with crap -- throw it out as early as possible.


And when Roy's customers demand that his product support emoji, or complain
that they cannot spell their own name because of his parochial and ignorant
idea of crap, perhaps he will consider doing what he should have done
from the beginning:

Stop using MySQL, which is a joke of a database[1], and use Postgres which
does not have this problem.

[1] So I have been told.


I use MySQL a fair bit, and Postgres very slightly. I would agree with your 
characterisation above; MySQL is littered with inconsistencies and arbitrary 
breakage, both in tools and SQL implementation. And Postgres has been a pure 
pleasure to work with, little though I have done that so far.


Cheers,
Cameron Simpson c...@zip.com.au

There is no human problem which could not be solved if people would simply
do as I advise. - Gore Vidal


Re: Newbie question about text encoding

2015-03-08 Thread Chris Angelico
On Mon, Mar 9, 2015 at 1:09 PM, Ben Finney ben+pyt...@benfinney.id.au wrote:
 Steven D'Aprano steve+comp.lang.pyt...@pearwood.info writes:

 '\udd00' should be a SyntaxError.

 I find your argument convincing, that attempting to construct a Unicode
 string of a lone surrogate should be an error.

 Shouldn't the error type be a ValueError, though? The statement is not,
 to my mind, erroneous syntax.

For the string literal, I would say SyntaxError is more appropriate
than ValueError, as a string object has to be constructed at
compilation time.

I'd still like to see a report from someone who has used a language
that specifically disallows all surrogates in strings. Does it help?
Is it more hassle than it's worth? Are there weird edge cases that it
breaks?

ChrisA


Re: Newbie question about text encoding

2015-03-08 Thread random832
On Sun, Mar 8, 2015, at 22:09, Ben Finney wrote:
 Steven D'Aprano steve+comp.lang.pyt...@pearwood.info writes:
 
  '\udd00' should be a SyntaxError.
 
 I find your argument convincing, that attempting to construct a Unicode
 string of a lone surrogate should be an error.
 
 Shouldn't the error type be a ValueError, though? The statement is not,
 to my mind, erroneous syntax.

In this hypothetical, it's a problem with evaluating a literal - in the
same way that '\U12345', or '\U00110000', is.
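
A sketch of where these errors surface in current CPython 3, using compile()
on source text. The helper literal_problem is hypothetical, and '\U00110000'
is chosen here as an example of an out-of-range escape. Malformed escapes are
reported as SyntaxError at compile time, a lone surrogate literal is accepted,
and the run-time equivalent chr() raises ValueError.

```python
def literal_problem(source: str) -> str:
    """Hypothetical helper: name of the error raised when compiling source."""
    try:
        compile(source, "<demo>", "eval")
    except SyntaxError:
        return "SyntaxError"
    except ValueError:
        return "ValueError"
    return "ok"


truncated = literal_problem(r"'\U12345'")          # too few hex digits
out_of_range = literal_problem(r"'\U00110000'")    # beyond U+10FFFF
lone_surrogate = literal_problem(r"'\udd00'")      # accepted today

# The same out-of-range value at run time is a ValueError instead.
try:
    chr(0x110000)
    runtime_error = "ok"
except ValueError:
    runtime_error = "ValueError"

print(truncated, out_of_range, lone_surrogate, runtime_error)
```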


Re: Newbie question about text encoding

2015-03-08 Thread Steven D'Aprano
Marko Rauhamaa wrote:

 Steven D'Aprano steve+comp.lang.pyt...@pearwood.info:
 
 Marko Rauhamaa wrote:
 '\udd00' is a valid str object:

 Is it though? Perhaps the bug is not UTF-8's inability to encode lone
 surrogates, but that Python allows you to create lone surrogates in
 the first place. That's not a rhetorical question. It's a genuine
 question.
 
 The problem is that no matter how you shuffle surrogates, encoding
 schemes, coding points and the like, a wrinkle always remains.

Really? Define your terms. Can you define wrinkles, and prove that it is
impossible to remove them? What's so bad about wrinkles anyway?


 I'm reminded of number sets where you go from ℕ to ℤ to ℚ to ℝ to ℂ. But
 that's where the buck stops; traditional arithmetic functions are closed
 under ℂ.

That's simply incorrect. What's z/(0+0i)?

There are many more number sets used by mathematicians, some going back to
the 1800s. Here are just a few:

* ℝ-overbar or [−∞, +∞], which adds a pair of infinities to ℝ.

* ℝ-caret or ℝ+{∞}, which does the same but with a single 
  unsigned infinity.

* A similar extended version of ℂ with a single infinity.

* Split-complex or hyperbolic numbers, defined similarly to ℂ 
  except with i**2 = +1 (rather than the complex i**2 = -1).

* Dual numbers, which add a single infinitesimal number ε != 0 
  with the property that ε**2 = 0.

* Hyperreal numbers.

* John Conway's surreal numbers, which may be the largest 
  possible set, in the sense that it can construct all finite, 
  infinite and infinitesimal numbers. (The hyperreals and dual 
  numbers can be considered subsets of the surreals.)

The process of extending ℝ to ℂ is formally known as Cayley–Dickson
construction, and there is an infinite number of algebras (and hence number
sets) which can be constructed this way. The next few are:

* Hamilton's quaternions ℍ, very useful for dealing with rotations 
  in 3D space. They fell out of favour for some decades, but are now
  experiencing something of a renaissance.

* Octonions or Cayley numbers.

* Sedenions.


 Unicode apparently hasn't found a similar closure.

Similar in what way? And why do you think this is important?

It is not a requirement for every possible byte sequence to be a valid
Unicode string, any more than it is a requirement for every possible byte
sequence to be valid JPG, zip archive, or ELF executable. Some byte strings
simply are not JPG images, zip archives or ELF executables -- or Unicode
strings. So what?

Why do you think that is a problem that needs fixing by the Unicode
standard? It may be a problem that needs fixing by (for example)
programming languages, and Python invented the surrogateescape error handler to
smuggle such invalid bytes into strings. Other solutions may exist as well.
But that's not part of Unicode and it isn't a problem for Unicode.


 That's why I think that while UTF-8 is a fabulous way to bring Unicode
 to Linux, Linux should have taken the tack that Unicode is always an
 application-level interpretation with few operating system tie-ins.

Should have? That is *exactly* the status quo, and while it was the only
practical solution given Linux's history, it's a horrible idea. That
Unicode is stuck on top of an OS which is unaware of Unicode is precisely
why we're left with problems like "how do you represent arbitrary bytes as
Unicode strings?".


 Unfortunately, the GNU world is busy trying to build a Unicode frosting
 everywhere. The illusion can never be complete but is convincing enough
 for application developers to forget to handle corner cases.
 
 To answer your question, I think every code point from 0 to 1114111
 should be treated as valid and analogous. 

Your opinion isn't very relevant. What is relevant is what the Unicode
standard demands, and I think it requires that strings containing
surrogates are illegal (rather like x/0 is illegal in the real numbers).
Wikipedia states:


The Unicode standard permanently reserves these code point 
values [U+D800 to U+DFFF] for UTF-16 encoding of the high 
and low surrogates, and they will never be assigned a 
character, so there should be no reason to encode them. The 
official Unicode standard says that no UTF forms, including 
UTF-16, can encode these code points.

However UCS-2, UTF-8, and UTF-32 can encode these code points
in trivial and obvious ways, and large amounts of software 
does so even though the standard states that such arrangements
should be treated as encoding errors. It is possible to 
unambiguously encode them in UTF-16 by using a code unit equal
to the code point, as long as no sequence of two code units can
be interpreted as a legal surrogate pair (that is, as long as a
high surrogate is never followed by a low surrogate). The 
majority of UTF-16 encoder and decoder implementations translate
between encodings as though this were the case.


http://en.wikipedia.org/wiki/UTF-16

So yet again we are left with the 

Re: Newbie question about text encoding

2015-03-07 Thread Marko Rauhamaa
Steven D'Aprano steve+comp.lang.pyt...@pearwood.info:

 For those cases where you do wish to take an arbitrary byte stream and
 round-trip it, Python now provides an error handler for that.

 py> import random
 py> b = bytes([random.randint(0, 255) for _ in range(1)])
 py> s = b.decode('utf-8')
 Traceback (most recent call last):
   File "<stdin>", line 1, in <module>
 UnicodeDecodeError: 'utf-8' codec can't decode byte 0x94 in position 0:
 invalid start byte
 py> s = b.decode('utf-8', errors='surrogateescape')
 py> s.encode('utf-8', errors='surrogateescape') == b
 True

That is indeed a valid workaround. With it we achieve

   b.decode('utf-8', errors='surrogateescape'). \
   encode('utf-8', errors='surrogateescape') == b

for any bytes b. It goes to great lengths to address the Linux
programmer's situation.
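
The round-trip claim can be checked exhaustively for short inputs. This sketch
assumes CPython 3 and tries every one- and two-byte sequence; round_trips is a
hypothetical helper name.

```python
from itertools import product


def round_trips(raw: bytes) -> bool:
    """True if raw survives decode + encode with surrogateescape."""
    text = raw.decode("utf-8", errors="surrogateescape")
    return text.encode("utf-8", errors="surrogateescape") == raw


# All 256 single bytes plus all 65536 byte pairs.
short_inputs = [bytes(p) for n in (1, 2) for p in product(range(256), repeat=n)]
all_ok = all(round_trips(raw) for raw in short_inputs)

# An undecodable byte is smuggled as a low surrogate in U+DC80..U+DCFF.
smuggled = b"\x80".decode("utf-8", errors="surrogateescape")

print(len(short_inputs), all_ok, ascii(smuggled))
```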

However,

 * it's not UTF-8 but a variant of it,

 * it sacrifices the ordering correspondence of UTF-8:

    >>> '\udc80' > 'ä'
    True
    >>> '\udc80'.encode('utf-8', errors='surrogateescape') > \
    ...     'ä'.encode('utf-8', errors='surrogateescape')
    False

 * it still isn't bijective between str and bytes:

    >>> '\udd00'.encode('utf-8', errors='surrogateescape')
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    UnicodeEncodeError: 'utf-8' codec can't encode character
    '\udd00' in position 0: surrogates not allowed
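
The ordering point can be sketched like this, assuming CPython 3 (the sample
strings are arbitrary). For strings without surrogates, sorting by code point
and sorting by UTF-8 bytes agree; a surrogateescape-smuggled byte breaks that
correspondence.

```python
samples = ["z", "\u00e4", "\uffff", "\U0001F600", "a", "\u20ac"]

by_code_point = sorted(samples)
by_utf8_bytes = sorted(samples, key=lambda s: s.encode("utf-8"))
same_order = by_code_point == by_utf8_bytes

# With surrogateescape, U+DC80 stands for the raw byte 0x80.
smuggled = "\udc80"
a_umlaut = "\u00e4"
str_says_greater = smuggled > a_umlaut
bytes_say_greater = (
    smuggled.encode("utf-8", errors="surrogateescape")
    > a_umlaut.encode("utf-8", errors="surrogateescape")
)

print(same_order, str_says_greater, bytes_say_greater)
```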


Marko


Re: Newbie question about text encoding

2015-03-07 Thread Rustom Mody
On Saturday, March 7, 2015 at 4:39:48 PM UTC+5:30, Steven D'Aprano wrote:
 Rustom Mody wrote:
  This includes not just bug-prone-system code such as Java and Windows but
  seemingly working code such as python 3.
 
 What Unicode bugs do you think Python 3.3 and above have?

Literal/Legalistic answer:
https://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2012-2135

[And already quoted at
http://blog.languager.org/2015/03/whimsical-unicode.html
]

An answer more in the spirit of what I am trying to say:
Idle3, Roy's example and in general all systems that are
python-centric but use components outside of python that are unicode-broken

IOW I would expect people (at least people with good faith) reading my

 bug-prone-system code...seemingly working code such as python 3...

to interpret that NOT as

python 3 is seemingly working but actually broken

But as

Apps made with working system code (eg python3) can end up being broken
because of other non-working system code - eg mysql, java, javascript, 
windows-shell, and ultimately windows, linux


Re: Newbie question about text encoding

2015-03-07 Thread Chris Angelico
On Sun, Mar 8, 2015 at 6:20 PM, Marko Rauhamaa ma...@pacujo.net wrote:
  * it still isn't bijective between str and bytes:

 >>> '\udd00'.encode('utf-8', errors='surrogateescape')
 Traceback (most recent call last):
   File "<stdin>", line 1, in <module>
 UnicodeEncodeError: 'utf-8' codec can't encode character
 '\udd00' in position 0: surrogates not allowed

Once again, you appear to be surprised that invalid data is failing.
Why is this so strange? U+DD00 is not a valid character. It is quite
correct to throw this error.

ChrisA


Re: Newbie question about text encoding

2015-03-07 Thread Marko Rauhamaa
Steven D'Aprano steve+comp.lang.pyt...@pearwood.info:

 Marko Rauhamaa wrote:

 That said, UTF-8 does suffer badly from its not being
 a bijective mapping.

 Can you explain?

In Python terms, there are bytes objects b that don't satisfy:

   b.decode('utf-8').encode('utf-8') == b


Marko


Re: Newbie question about text encoding

2015-03-07 Thread Chris Angelico
On Sun, Mar 8, 2015 at 2:48 AM, Marko Rauhamaa ma...@pacujo.net wrote:
 Steven D'Aprano steve+comp.lang.pyt...@pearwood.info:

 Marko Rauhamaa wrote:

 That said, UTF-8 does suffer badly from its not being
 a bijective mapping.

 Can you explain?

 In Python terms, there are bytes objects b that don't satisfy:

b.decode('utf-8').encode('utf-8') == b

Please provide an example; that sounds like a bug. If there is any
invalid UTF-8 stream which decodes without an error, it is actually a
security bug, and should be fixed pronto in all affected and supported
versions.
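
One way to test the property Chris describes, as a sketch assuming CPython 3's
strict codec: every one- or two-byte sequence that decodes at all must
re-encode to the same bytes, and an overlong form such as C0 AF (an illegal
two-byte spelling of '/') must be rejected. The helper decodes is hypothetical.

```python
from itertools import product


def decodes(raw: bytes) -> bool:
    """True if the strict UTF-8 decoder accepts raw."""
    try:
        raw.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True


candidates = [bytes(p) for n in (1, 2) for p in product(range(256), repeat=n)]
valid = [raw for raw in candidates if decodes(raw)]

# Anything the decoder accepts must come back byte-for-byte.
all_valid_round_trip = all(
    raw.decode("utf-8").encode("utf-8") == raw for raw in valid
)
overlong_slash_rejected = not decodes(b"\xc0\xaf")

print(all_valid_round_trip, overlong_slash_rejected)
```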

ChrisA


Re: Newbie question about text encoding

2015-03-07 Thread Chris Angelico
On Sun, Mar 8, 2015 at 3:25 AM, Marko Rauhamaa ma...@pacujo.net wrote:
 Chris Angelico ros...@gmail.com:

 On Sun, Mar 8, 2015 at 2:48 AM, Marko Rauhamaa ma...@pacujo.net wrote:
 Steven D'Aprano steve+comp.lang.pyt...@pearwood.info:

 Marko Rauhamaa wrote:

 That said, UTF-8 does suffer badly from its not being
 a bijective mapping.

 Can you explain?

 In Python terms, there are bytes objects b that don't satisfy:

b.decode('utf-8').encode('utf-8') == b

 Please provide an example; that sounds like a bug. If there is any
 invalid UTF-8 stream which decodes without an error, it is actually a
 security bug, and should be fixed pronto in all affected and supported
 versions.

 Here's an example:

b = b'\x80'

 Yes, it generates an exception. IOW, UTF-8 is not a bijective mapping
 from str objects to bytes objects.

That's not the same as what you said. All you've proven is that there
are bit patterns which are not UTF-8 streams... which is a very
deliberate feature. How does UTF-8 *suffer* from this? It benefits
hugely!

ChrisA


Re: Newbie question about text encoding

2015-03-07 Thread Mark Lawrence

On 07/03/2015 16:25, Marko Rauhamaa wrote:

Chris Angelico ros...@gmail.com:


On Sun, Mar 8, 2015 at 2:48 AM, Marko Rauhamaa ma...@pacujo.net wrote:

Steven D'Aprano steve+comp.lang.pyt...@pearwood.info:


Marko Rauhamaa wrote:


That said, UTF-8 does suffer badly from its not being
a bijective mapping.


Can you explain?


In Python terms, there are bytes objects b that don't satisfy:

b.decode('utf-8').encode('utf-8') == b


Please provide an example; that sounds like a bug. If there is any
invalid UTF-8 stream which decodes without an error, it is actually a
security bug, and should be fixed pronto in all affected and supported
versions.


Here's an example:

b = b'\x80'

Yes, it generates an exception. IOW, UTF-8 is not a bijective mapping
from str objects to bytes objects.



Python 2 might, Python 3 doesn't.

--
My fellow Pythonistas, ask not what our language can do for you, ask
what you can do for our language.

Mark Lawrence



Re: Newbie question about text encoding

2015-03-07 Thread Marko Rauhamaa
Chris Angelico ros...@gmail.com:

 On Sun, Mar 8, 2015 at 3:25 AM, Marko Rauhamaa ma...@pacujo.net wrote:
 Marko Rauhamaa wrote:
 That said, UTF-8 does suffer badly from its not being
 a bijective mapping.

 Here's an example:

b = b'\x80'

 Yes, it generates an exception. IOW, UTF-8 is not a bijective mapping
 from str objects to bytes objects.

 That's not the same as what you said.

Except that it's precisely what I said.

 All you've proven is that there are bit patterns which are not UTF-8
 streams...

And that causes problems.

 which is a very deliberate feature.

Well, nobody desired it. It was just something that had to give.

I believe you *could* have defined it as a bijective mapping but then
you would have lost the sorting order correspondence.

 How does UTF-8 *suffer* from this? It benefits hugely!

You can't operate on file names and text files using Python strings. Or
at least, you will need to add (nontrivial) exception catching logic.


Marko


Re: Newbie question about text encoding

2015-03-07 Thread Chris Angelico
On Sun, Mar 8, 2015 at 4:50 AM, Marko Rauhamaa ma...@pacujo.net wrote:
 There are two things happening here:

 1) The underlying file system is not UTF-8, and you can't depend on
 that,

 Correct. Linux pathnames are octet strings regardless of the locale.

 That's why Linux developers should refer to filenames using bytes.
 Unfortunately, Python itself violates that principle by having
 os.listdir() return str objects (to mention one example).

Only because you gave it a str with the path name. If you want to
refer to file names using bytes, then be consistent and refer to ALL
file names using bytes. As I demonstrated, that works just fine.

 2) You forgot to put the path on that, so it failed to find the file.
 Here's my version of your demo:

 >>> open("/tmp/xyz/"+os.listdir('/tmp/xyz')[0])
 <_io.TextIOWrapper name='/tmp/xyz/\udc80' mode='r' encoding='UTF-8'>

 Looks fine to me.

 I stand corrected.

 Then we have:

 >>> os.listdir()[0].encode('utf-8')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
UnicodeEncodeError: 'utf-8' codec can't encode character '\udc80' in
position 0: surrogates not allowed

So?
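
A minimal sketch of why that traceback need not be a dead end, assuming a POSIX system whose filesystem encoding is UTF-8 (the variable names are illustrative only): the strict encode fails, but os.fsencode() reverses the decoding that produced the str name and returns the original bytes.

```python
import os

# What os.listdir() would hand back as str for the on-disk name b'\x80'
# (assumes POSIX; with a UTF-8 filesystem encoding this is '\udc80').
name = os.fsdecode(b'\x80')

try:
    name.encode('utf-8')        # strict UTF-8 refuses lone surrogates
    strict_ok = True
except UnicodeEncodeError:
    strict_ok = False

# os.fsencode() applies the filesystem encoding and error handler in
# reverse, so the original byte string comes back unchanged.
original = os.fsencode(name)
```

The str form is therefore usable as a file name and convertible back to bytes; it simply is not valid text for a strict UTF-8 encode.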

ChrisA


Re: Newbie question about text encoding

2015-03-07 Thread Chris Angelico
On Sun, Mar 8, 2015 at 5:34 AM, Dan Sommers d...@tombstonezero.net wrote:
 I think we're all agreeing:  not all file systems are the same, and
 Python doesn't smooth out all of the bumps, even for something that
 seems as simple as displaying the names of files in a directory.  And
 that's *after* we've agreed that filesystems contain files in
 hierarchical directories.

I think you and I are in agreement. No idea about Marko, I'm still not
entirely sure what he's saying.

Python can't smooth out all of the bumps in file systems, any more
than Unicode can smooth out the bumps in natural language, or TCP can
smooth out the bumps in IP. The abstraction layers help, but every now
and then they leak, and you have to cope with the underlying mess.

ChrisA


Re: Newbie question about text encoding

2015-03-07 Thread Mark Lawrence

On 07/03/2015 16:48, Marko Rauhamaa wrote:

Mark Lawrence breamore...@yahoo.co.uk:


On 07/03/2015 16:25, Marko Rauhamaa wrote:

Here's an example:

 b = b'\x80'

Yes, it generates an exception. IOW, UTF-8 is not a bijective mapping
from str objects to bytes objects.


Python 2 might, Python 3 doesn't.


Python 3.3.2 (default, Dec  4 2014, 12:49:00)
[GCC 4.8.3 20140911 (Red Hat 4.8.3-7)] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> b'\x80'.decode('utf-8')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
UnicodeDecodeError: 'utf-8' codec can't decode byte 0x80 in position 0:
invalid start byte


Marko



It would clearly help if you were to type in the correct UK English accent.

--
My fellow Pythonistas, ask not what our language can do for you, ask
what you can do for our language.

Mark Lawrence



Re: Newbie question about text encoding

2015-03-07 Thread Marko Rauhamaa
Dan Sommers d...@tombstonezero.net:

 I think we're all agreeing: not all file systems are the same, and
 Python doesn't smooth out all of the bumps, even for something that
 seems as simple as displaying the names of files in a directory. And
 that's *after* we've agreed that filesystems contain files in
 hierarchical directories.

A whole new set of problems took root with Unicode. There were gains but
there were losses, too.

Python is not alone in the conceptual difficulties. Guile 2's (readdir)
simply converts bad UTF-8 in a filename into a question mark:

   scheme@(guile-user) [1]> (readdir s)
   $3 = "?"
   scheme@(guile-user) [4]> (equal? $3 "?")
   $4 = #t

So does lxterminal:

   $ ls
   ?

even though it's all bytes on the inside:

   $ [ "$(ls)" = "?" ]
   $ echo $?
   1

Scripts that make use of standard text utilities must now be very
careful:

   $ ls | egrep ^.$ | wc -l
   0

You are well advised to sprinkle LANG=C in your scripts:

   $ ls | LANG=C egrep ^.$ | wc -l
   1

Nasty locale-related bugs plague installation scripts, whose writers are
not accustomed to running their tests in myriads of locales. The topic
is of course larger than just Unicode.


Marko


Re: Newbie question about text encoding

2015-03-07 Thread Chris Angelico
On Sun, Mar 8, 2015 at 3:40 AM, Mark Lawrence breamore...@yahoo.co.uk wrote:
 Here's an example:

 b = b'\x80'

 Yes, it generates an exception. IOW, UTF-8 is not a bijective mapping
 from str objects to bytes objects.


 Python 2 might, Python 3 doesn't.

He was talking about this line of code:

b.decode('utf-8').encode('utf-8') == b

With the above assignment, that does indeed throw an error - which is
correct behaviour.

Challenge: Figure out a byte-string input that will make this function
return True.

def is_utf8_broken(b):
    return b.decode('utf-8').encode('utf-8') != b

Correct responses for this function are either False or raising an exception.
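
As a hedged illustration of the two "correct responses" (roundtrips is a hypothetical helper name, not something from this thread), valid UTF-8 input round-trips unchanged, while invalid input is rejected with an exception rather than silently altered:

```python
def roundtrips(b: bytes) -> bool:
    """Return True if b survives decode/encode through UTF-8 unchanged.

    Raises UnicodeDecodeError if b is not valid UTF-8.
    """
    return b.decode('utf-8').encode('utf-8') == b


# Valid UTF-8 byte strings round-trip unchanged.
samples = [b'', b'abc', 'na\u00efve'.encode('utf-8'), '\U0001f600'.encode('utf-8')]
results = [roundtrips(s) for s in samples]

# A lone continuation byte is not valid UTF-8, so decoding is rejected.
try:
    roundtrips(b'\x80')
    rejected = False
except UnicodeDecodeError:
    rejected = True
```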

ChrisA


Re: Newbie question about text encoding

2015-03-07 Thread Marko Rauhamaa
Mark Lawrence breamore...@yahoo.co.uk:

 It would clearly help if you were to type in the correct UK English
 accent.

Your ad-hominem-to-contribution ratio is alarmingly high.


Marko


Re: Newbie question about text encoding

2015-03-07 Thread Chris Angelico
On Sun, Mar 8, 2015 at 4:14 AM, Marko Rauhamaa ma...@pacujo.net wrote:
 See:

$ mkdir /tmp/xyz
$ touch /tmp/xyz/$'\x80'
$ python3
Python 3.3.2 (default, Dec  4 2014, 12:49:00)
[GCC 4.8.3 20140911 (Red Hat 4.8.3-7)] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import os
>>> os.listdir('/tmp/xyz')
['\udc80']
>>> open(os.listdir('/tmp/xyz')[0])
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
FileNotFoundError: [Errno 2] No such file or directory: '\udc80'

 File names encoded with Latin-X are quite commonplace even in UTF-8
 locales.

That is not a problem with UTF-8, though. I don't understand how
you're blaming UTF-8 for that. There are two things happening here:

1) The underlying file system is not UTF-8, and you can't depend on
that, ergo the decode to Unicode has to have some special handling of
failing bytes.
2) You forgot to put the path on that, so it failed to find the file.
Here's my version of your demo:

>>> open("/tmp/xyz/"+os.listdir('/tmp/xyz')[0])
<_io.TextIOWrapper name='/tmp/xyz/\udc80' mode='r' encoding='UTF-8'>

Looks fine to me.

Alternatively, if you pass a byte string to os.listdir, you get back a
list of byte string file names:

>>> os.listdir(b"/tmp/xyz")
[b'\x80']
>>> open(b"/tmp/xyz/"+os.listdir(b'/tmp/xyz')[0])
<_io.TextIOWrapper name=b'/tmp/xyz/\x80' mode='r' encoding='UTF-8'>

Either way works. You can use bytes or text, and if you use text,
there is a way to smuggle bytes through it. None of this has anything
to do with UTF-8 as an encoding. (Note that the encoding='UTF-8'
note in the response has to do with the presumed encoding of the file
contents, not of the file name. As an empty file, it can be considered
to be a stream of zero Unicode characters, encoded UTF-8, so that's
valid.)
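
The "smuggling" mentioned above is the surrogateescape error handler (PEP 383). A minimal sketch of the mechanism on a plain byte string, with illustrative variable names:

```python
raw = b'\x80'  # not valid UTF-8 on its own

# surrogateescape maps each undecodable byte 0x80..0xFF to the lone
# surrogate U+DC80..U+DCFF instead of raising an error.
smuggled = raw.decode('utf-8', errors='surrogateescape')

# Encoding with the same handler restores the original byte.
restored = smuggled.encode('utf-8', errors='surrogateescape')
```

This is why '\udc80' shows up in the str listing: it stands in for the byte 0x80 and can be turned back into it.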

ChrisA


Re: Newbie question about text encoding

2015-03-07 Thread Mark Lawrence

On 07/03/2015 17:16, Marko Rauhamaa wrote:

Mark Lawrence breamore...@yahoo.co.uk:


It would clearly help if you were to type in the correct UK English
accent.


Your ad-hominem-to-contribution ratio is alarmingly high.


Marko



You've been a PITA ever since you first joined this list, what about it?

--
My fellow Pythonistas, ask not what our language can do for you, ask
what you can do for our language.

Mark Lawrence



Re: Newbie question about text encoding

2015-03-07 Thread Dan Sommers
On Sun, 08 Mar 2015 05:13:09 +1100, Chris Angelico wrote:

 On Sun, Mar 8, 2015 at 5:02 AM, Dan Sommers d...@tombstonezero.net wrote:
 On Sun, 08 Mar 2015 04:59:56 +1100, Chris Angelico wrote:

 On Sun, Mar 8, 2015 at 4:50 AM, Marko Rauhamaa ma...@pacujo.net wrote:

 Correct. Linux pathnames are octet strings regardless of the locale.

 That's why Linux developers should refer to filenames using bytes.
 Unfortunately, Python itself violates that principle by having
 os.listdir() return str objects (to mention one example).

 Only because you gave it a str with the path name. If you want to
 refer to file names using bytes, then be consistent and refer to ALL
 file names using bytes. As I demonstrated, that works just fine.

 Python 3.4.2 (default, Oct  8 2014, 10:45:20)
 [GCC 4.9.1] on linux
 Type "help", "copyright", "credits" or "license" for more information.
 >>> import os
 >>> type(os.listdir(os.curdir)[0])
 <class 'str'>
 
 Help on module os:
 
 DESCRIPTION
 This exports:
   - os.curdir is a string representing the current directory ('.' or ':')
   - os.pardir is a string representing the parent directory ('..' or '::')
 
 Explicitly documented as strings. If you want to work with strings,
 work with strings. If you want to work with bytes, don't use
 os.curdir, use bytes instead. Personally, I'm happy using strings, but
 if you want to go down the path of using bytes, you simply have to be
 consistent, and that probably means being platform-dependent anyway,
 so just use b"." for the current directory.

I think we're all agreeing:  not all file systems are the same, and
Python doesn't smooth out all of the bumps, even for something that
seems as simple as displaying the names of files in a directory.  And
that's *after* we've agreed that filesystems contain files in
hierarchical directories.

Dan


Re: Newbie question about text encoding

2015-03-07 Thread Chris Angelico
On Sun, Mar 8, 2015 at 3:54 AM, Marko Rauhamaa ma...@pacujo.net wrote:
 You can't operate on file names and text files using Python strings. Or
 at least, you will need to add (nontrivial) exception catching logic.

You can't operate on a JPG file using a Unicode string, nor an array
of integers. What of it? You can't operate on an array of integers
using a dictionary, either. So? How is this a failing of UTF-8?

If you really REALLY can't use the bytes() type to work with something
that is, yaknow, bytes, then you could use an alternative encoding
that has a value for every byte. It's still not Unicode text, so it
doesn't much matter which encoding you use. But it's much better to
use the bytes type to work with bytes. It is not text, so don't treat
it as text.
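
As a sketch of the "alternative encoding that has a value for every byte" option, Latin-1 is one such encoding (the choice of Latin-1 here is an assumption for illustration; the message does not name one):

```python
all_bytes = bytes(range(256))

# Latin-1 maps byte N to code point N, so decoding never fails...
as_text = all_bytes.decode('latin-1')

# ...and encoding the result gives back exactly the same bytes.
back = as_text.encode('latin-1')
```

The resulting str is only a carrier for the bytes; its characters carry no meaning unless the data really was Latin-1 text.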

ChrisA


Re: Newbie question about text encoding

2015-03-07 Thread Marko Rauhamaa
Chris Angelico ros...@gmail.com:

 If you really REALLY can't use the bytes() type to work with something
 that is, yaknow, bytes, then you could use an alternative encoding
 that has a value for every byte. It's still not Unicode text, so it
 doesn't much matter which encoding you use. But it's much better to
 use the bytes type to work with bytes. It is not text, so don't treat
 it as text.

See:

   $ mkdir /tmp/xyz
   $ touch /tmp/xyz/$'\x80'
   $ python3
   Python 3.3.2 (default, Dec  4 2014, 12:49:00) 
   [GCC 4.8.3 20140911 (Red Hat 4.8.3-7)] on linux
   Type "help", "copyright", "credits" or "license" for more information.
   >>> import os
   >>> os.listdir('/tmp/xyz')
   ['\udc80']
   >>> open(os.listdir('/tmp/xyz')[0])
   Traceback (most recent call last):
     File "<stdin>", line 1, in <module>
   FileNotFoundError: [Errno 2] No such file or directory: '\udc80'

File names encoded with Latin-X are quite commonplace even in UTF-8
locales.


Marko


Re: Newbie question about text encoding

2015-03-07 Thread Dan Sommers
On Sun, 08 Mar 2015 04:59:56 +1100, Chris Angelico wrote:

 On Sun, Mar 8, 2015 at 4:50 AM, Marko Rauhamaa ma...@pacujo.net wrote:

 Correct. Linux pathnames are octet strings regardless of the locale.

 That's why Linux developers should refer to filenames using bytes.
 Unfortunately, Python itself violates that principle by having
 os.listdir() return str objects (to mention one example).
 
 Only because you gave it a str with the path name. If you want to
 refer to file names using bytes, then be consistent and refer to ALL
 file names using bytes. As I demonstrated, that works just fine.

Python 3.4.2 (default, Oct  8 2014, 10:45:20) 
[GCC 4.9.1] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import os
>>> type(os.listdir(os.curdir)[0])
<class 'str'>


Re: Newbie question about text encoding

2015-03-07 Thread Mark Lawrence

On 07/03/2015 18:34, Dan Sommers wrote:

On Sun, 08 Mar 2015 05:13:09 +1100, Chris Angelico wrote:


On Sun, Mar 8, 2015 at 5:02 AM, Dan Sommers d...@tombstonezero.net wrote:

On Sun, 08 Mar 2015 04:59:56 +1100, Chris Angelico wrote:


On Sun, Mar 8, 2015 at 4:50 AM, Marko Rauhamaa ma...@pacujo.net wrote:



Correct. Linux pathnames are octet strings regardless of the locale.

That's why Linux developers should refer to filenames using bytes.
Unfortunately, Python itself violates that principle by having
os.listdir() return str objects (to mention one example).


Only because you gave it a str with the path name. If you want to
refer to file names using bytes, then be consistent and refer to ALL
file names using bytes. As I demonstrated, that works just fine.


Python 3.4.2 (default, Oct  8 2014, 10:45:20)
[GCC 4.9.1] on linux
Type "help", "copyright", "credits" or "license" for more information.

>>> import os
>>> type(os.listdir(os.curdir)[0])

<class 'str'>


Help on module os:

DESCRIPTION
 This exports:
   - os.curdir is a string representing the current directory ('.' or ':')
   - os.pardir is a string representing the parent directory ('..' or '::')

Explicitly documented as strings. If you want to work with strings,
work with strings. If you want to work with bytes, don't use
os.curdir, use bytes instead. Personally, I'm happy using strings, but
if you want to go down the path of using bytes, you simply have to be
consistent, and that probably means being platform-dependent anyway,
so just use b"." for the current directory.


I think we're all agreeing:  not all file systems are the same, and
Python doesn't smooth out all of the bumps, even for something that
seems as simple as displaying the names of files in a directory.  And
that's *after* we've agreed that filesystems contain files in
hierarchical directories.

Dan



Isn't pathlib 
https://docs.python.org/3/library/pathlib.html#module-pathlib 
effectively a more recent attempt at smoothing or even removing (some 
of) the bumps?  Has anybody here got experience of it as I've never used it?


--
My fellow Pythonistas, ask not what our language can do for you, ask
what you can do for our language.

Mark Lawrence



Re: Newbie question about text encoding

2015-03-07 Thread Marko Rauhamaa
Mark Lawrence breamore...@yahoo.co.uk:

 On 07/03/2015 16:25, Marko Rauhamaa wrote:
 Here's an example:

 b = b'\x80'

 Yes, it generates an exception. IOW, UTF-8 is not a bijective mapping
 from str objects to bytes objects.

 Python 2 might, Python 3 doesn't.

   Python 3.3.2 (default, Dec  4 2014, 12:49:00) 
   [GCC 4.8.3 20140911 (Red Hat 4.8.3-7)] on linux
   Type "help", "copyright", "credits" or "license" for more information.
   >>> b'\x80'.decode('utf-8')
   Traceback (most recent call last):
     File "<stdin>", line 1, in <module>
   UnicodeDecodeError: 'utf-8' codec can't decode byte 0x80 in position 0:
   invalid start byte


Marko


Re: Newbie question about text encoding

2015-03-07 Thread Chris Angelico
On Sun, Mar 8, 2015 at 3:54 AM, Marko Rauhamaa ma...@pacujo.net wrote:
 All you've proven is that there are bit patterns which are not UTF-8
 streams...

 And that causes problems.

Demonstrate.

ChrisA


Re: Newbie question about text encoding

2015-03-07 Thread Marko Rauhamaa
Chris Angelico ros...@gmail.com:

 On Sun, Mar 8, 2015 at 4:14 AM, Marko Rauhamaa ma...@pacujo.net wrote:
 File names encoded with Latin-X are quite commonplace even in UTF-8
 locales.

 That is not a problem with UTF-8, though. I don't understand how
 you're blaming UTF-8 for that.

I'm saying it creates practical problems. There's a snake in the
paradise.

 There are two things happening here:

 1) The underlying file system is not UTF-8, and you can't depend on
 that,

Correct. Linux pathnames are octet strings regardless of the locale.

That's why Linux developers should refer to filenames using bytes.
Unfortunately, Python itself violates that principle by having
os.listdir() return str objects (to mention one example).

 2) You forgot to put the path on that, so it failed to find the file.
 Here's my version of your demo:

 >>> open("/tmp/xyz/"+os.listdir('/tmp/xyz')[0])
 <_io.TextIOWrapper name='/tmp/xyz/\udc80' mode='r' encoding='UTF-8'>

 Looks fine to me.

I stand corrected.

Then we have:

   >>> os.listdir()[0].encode('utf-8')
   Traceback (most recent call last):
     File "<stdin>", line 1, in <module>
   UnicodeEncodeError: 'utf-8' codec can't encode character '\udc80' in
   position 0: surrogates not allowed


Marko


Re: Newbie question about text encoding

2015-03-07 Thread Chris Angelico
On Sun, Mar 8, 2015 at 5:02 AM, Dan Sommers d...@tombstonezero.net wrote:
 On Sun, 08 Mar 2015 04:59:56 +1100, Chris Angelico wrote:

 On Sun, Mar 8, 2015 at 4:50 AM, Marko Rauhamaa ma...@pacujo.net wrote:

 Correct. Linux pathnames are octet strings regardless of the locale.

 That's why Linux developers should refer to filenames using bytes.
 Unfortunately, Python itself violates that principle by having
 os.listdir() return str objects (to mention one example).

 Only because you gave it a str with the path name. If you want to
 refer to file names using bytes, then be consistent and refer to ALL
 file names using bytes. As I demonstrated, that works just fine.

 Python 3.4.2 (default, Oct  8 2014, 10:45:20)
 [GCC 4.9.1] on linux
 Type "help", "copyright", "credits" or "license" for more information.
 >>> import os
 >>> type(os.listdir(os.curdir)[0])
 <class 'str'>

Help on module os:

DESCRIPTION
This exports:
  - os.curdir is a string representing the current directory ('.' or ':')
  - os.pardir is a string representing the parent directory ('..' or '::')

Explicitly documented as strings. If you want to work with strings,
work with strings. If you want to work with bytes, don't use
os.curdir, use bytes instead. Personally, I'm happy using strings, but
if you want to go down the path of using bytes, you simply have to be
consistent, and that probably means being platform-dependent anyway,
so just use b"." for the current directory.

Normally, using Unicode strings for file names will work just fine.
Any name that you craft yourself will be correctly encoded for the
target file system (or UTF-8 if you can't know), and any that you get
back from os.listdir or equivalent will be usable in file name
contexts. What else can you do with a file name that isn't encoded the
way you expect it to be? Unless you have some out-of-band encoding
information, you can't do anything meaningful with the stream of
bytes, other than keeping it exactly as it is.
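
A minimal sketch of the "be consistent" rule, using a throwaway temporary directory and a hypothetical file name: os.listdir() returns names of the same type as the path it was given.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    # Create one empty file with a plain ASCII name.
    open(os.path.join(tmp, 'example.txt'), 'w').close()

    str_names = os.listdir(tmp)                 # str in, str out
    bytes_names = os.listdir(os.fsencode(tmp))  # bytes in, bytes out
```

Sticking to one type throughout avoids ever having to encode or decode a name whose encoding is unknown.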

ChrisA


Re: Newbie question about text encoding

2015-03-07 Thread Albert-Jan Roskam


--- Original Message -

 From: Chris Angelico ros...@gmail.com
 To: 
 Cc: python-list@python.org python-list@python.org
 Sent: Saturday, March 7, 2015 6:26 PM
 Subject: Re: Newbie question about text encoding
 
 On Sun, Mar 8, 2015 at 4:14 AM, Marko Rauhamaa ma...@pacujo.net wrote:
  See:
 
 $ mkdir /tmp/xyz
 $ touch /tmp/xyz/$'\x80'
 $ python3
 Python 3.3.2 (default, Dec  4 2014, 12:49:00)
 [GCC 4.8.3 20140911 (Red Hat 4.8.3-7)] on linux
 Type "help", "copyright", "credits" or "license" for more information.
 >>> import os
 >>> os.listdir('/tmp/xyz')
 ['\udc80']
 >>> open(os.listdir('/tmp/xyz')[0])
 Traceback (most recent call last):
   File "<stdin>", line 1, in <module>
 FileNotFoundError: [Errno 2] No such file or directory: '\udc80'
 
  File names encoded with Latin-X are quite commonplace even in UTF-8
  locales.
 
 That is not a problem with UTF-8, though. I don't understand how
 you're blaming UTF-8 for that. There are two things happening here:
 
 1) The underlying file system is not UTF-8, and you can't depend on
 that, ergo the decode to Unicode has to have some special handling of
 failing bytes.
 2) You forgot to put the path on that, so it failed to find the file.
 Here's my version of your demo:
 
 >>> open("/tmp/xyz/"+os.listdir('/tmp/xyz')[0])
 <_io.TextIOWrapper name='/tmp/xyz/\udc80' mode='r' encoding='UTF-8'>
 
 Looks fine to me.
 
 Alternatively, if you pass a byte string to os.listdir, you get back a
 list of byte string file names:
 
 >>> os.listdir(b"/tmp/xyz")
 [b'\x80']

Nice, I did not know that. And glob.glob works the same way: it returns a list 
of ustrings when given a ustring, and returns bstrings when given a bstring.
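
A short sketch of that glob.glob behaviour, using a temporary directory and a hypothetical file name:

```python
import glob
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    open(os.path.join(tmp, 'a.txt'), 'w').close()

    # A str pattern yields str results.
    str_matches = glob.glob(os.path.join(tmp, '*.txt'))

    # A bytes pattern yields bytes results.
    bytes_matches = glob.glob(os.path.join(os.fsencode(tmp), b'*.txt'))
```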


Re: Newbie question about text encoding

2015-03-07 Thread Dan Sommers
On Sat, 07 Mar 2015 19:00:47 +0000, Mark Lawrence wrote:

 Isn't pathlib
 https://docs.python.org/3/library/pathlib.html#module-pathlib
 effectively a more recent attempt at smoothing or even removing (some
 of) the bumps?  Has anybody here got experience of it as I've never
 used it?

I almost said something about Common Lisp's PATHNAME type, but I didn't.

An extremely quick reading of that page tells me that pathlib
addresses *some* of the issues that PATHNAME addresses, but pathlib
seems more limited in scope (e.g., pathlib doesn't account for
filesystems with versioned files).  I'll certainly have a closer look
later.
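
For a first impression, a minimal sketch of how pathlib relates to the str/bytes question discussed in this thread (temporary directory and file name are made up for the example): Path objects carry str names, and bytes(path) gives the filesystem-encoded form, the same result as os.fsencode().

```python
import os
import pathlib
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    root = pathlib.Path(tmp)
    (root / 'example.txt').touch()

    entries = list(root.iterdir())
    names = [p.name for p in entries]   # str names
    raw = [bytes(p) for p in entries]   # filesystem-encoded bytes
    same = all(bytes(p) == os.fsencode(p) for p in entries)
```

So pathlib tidies up path manipulation, but it still sits on top of the same str-based view of file names.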

