[google-appengine] Re: different urlfetch behaviour on development and production servers: URLs with the characters: ( or )

2009-05-17 Thread Matt Trinneer

Hello,

Although I'm still not sure why urllib2 was able to retrieve URLs
containing parentheses, I have since discovered that the issue lies
not with Google App Engine but with dbpedia.  I misdiagnosed the
problem and thought I'd update this thread to say so.

If anyone is interested, the root cause seems to be a redirect (303
See Other) where the URL being redirected to is not percent-encoded.
A direct request to the redirect URL, with encoding, retrieves the
intended document.
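
The workaround described here, percent-encoding the redirect target
before requesting it, might be sketched as follows.  This is only an
illustration, written for Python 3 rather than the 2009-era App Engine
runtime.  The helper name encode_redirect_target and the example
redirect URL are hypothetical, since the thread does not show the
actual Location value that dbpedia returned.

```python
from urllib.parse import quote, urlsplit, urlunsplit


def encode_redirect_target(location):
    """Percent-encode unsafe characters in the path of a redirect URL.

    '%' is kept in the safe set so that sequences which are already
    encoded (for example %28) are not encoded a second time.
    """
    parts = urlsplit(location)
    encoded_path = quote(parts.path, safe="/%")
    return urlunsplit(parts._replace(path=encoded_path))


# Hypothetical Location header value containing raw parentheses.
raw = "http://dbpedia.org/data/Companion_(manga).rdf"
print(encode_redirect_target(raw))
```

With urlfetch, one way to apply this would presumably be to call
fetch() with follow_redirects=False, read the Location header from the
303 response, encode it as above, and issue a second request.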

Matthew

--~--~-~--~~~---~--~~
You received this message because you are subscribed to the Google Groups 
Google App Engine group.
To post to this group, send email to google-appengine@googlegroups.com
To unsubscribe from this group, send email to 
google-appengine+unsubscr...@googlegroups.com
For more options, visit this group at 
http://groups.google.com/group/google-appengine?hl=en
-~--~~~~--~~--~--~---



[google-appengine] Re: different urlfetch behaviour on development and production servers: URLs with the characters: ( or )

2009-05-05 Thread George

Ivan, your problem looks like a common encoding problem. The default
encoding on the GAE servers is ASCII, but on your computer it is
probably something else, such as UTF-8. So the code works in your
development environment but not on Google's servers.

To deal with this problem you need to declare the encoding in the file
header and decode your strings to unicode with the proper charset
before using them. If you don't, the Python interpreter will do the
conversion for you using the system default encoding. I agree this is
a little confusing. Python should handle it more elegantly.


As for Matthew's problem, sorry, I have no idea either. urlfetch is
something of a mystery among the GAE libs. I have found several
examples that work fine locally but throw errors on the server. So I
can only suggest you avoid the danger zone, such as parentheses in
URLs. :-)

--
George

App Engine Unit Test Framework
http://code.google.com/p/gaeunit/





[google-appengine] Re: different urlfetch behaviour on development and production servers: URLs with the characters: ( or )

2009-05-05 Thread Matt Trinneer

Hi George,

Thanks for the response.  I've done some additional testing and am not
getting much further.  Unfortunately in this case I do not have
control of the endpoint and am stuck with parentheses in the URL.

Some additional notes which may be of use to anyone who happens upon
this:

1. The URLs being requested in this example return RDF/XML
documents.
2. When requesting a resource without parentheses in its URL, a
response similar to the following is received (truncated for brevity):

<?xml version="1.0" encoding="utf-8" ?>
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#">
<rdf:Description
rdf:about="http://dbpedia.org/resource/Companion_%28manga%29">...</rdf:Description>
</rdf:RDF>

3. On the GAE production environment the response to a request for a
URL with parentheses is not an error, but rather an empty RDF document:

<?xml version="1.0" encoding="utf-8" ?>
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#">
</rdf:RDF>

4. This led me to speculate that the request being received by the
remote host was not for the same resource I believed I was
requesting.  So, with the help of another non-GAE endpoint, I have
been logging requests generated via urlfetch, and I am not able to see
any appreciable difference between those sent by the development
version, where these requests work, and the production version, where
they don't.
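
Because the failure in note 3 arrives as a well-formed but empty
document rather than an error, it can be detected only by inspecting
the body.  A minimal sketch of such a check, in Python 3 with the
standard library, is shown below.  The function name
count_descriptions and both sample documents are illustrative
assumptions modelled on the truncated responses quoted in this
message, not the actual dbpedia output.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"


def count_descriptions(rdf_xml):
    """Return the number of top-level rdf:Description elements."""
    root = ET.fromstring(rdf_xml)
    return len(root.findall("{%s}Description" % RDF_NS))


# Sample documents (assumed shapes, for illustration only).
EMPTY_DOC = b"""<?xml version="1.0" encoding="utf-8" ?>
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#">
</rdf:RDF>"""

FULL_DOC = b"""<?xml version="1.0" encoding="utf-8" ?>
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#">
  <rdf:Description rdf:about="http://dbpedia.org/resource/Companion_%28manga%29">
  </rdf:Description>
</rdf:RDF>"""

for name, doc in (("empty", EMPTY_DOC), ("full", FULL_DOC)):
    # A count of zero signals the "empty RDF document" symptom.
    print(name, count_descriptions(doc))
```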

Continuing to investigate




[google-appengine] Re: different urlfetch behaviour on development and production servers: URLs with the characters: ( or )

2009-05-05 Thread Matt Trinneer

Having some luck...  By using urllib2 instead of urlfetch I am able to
load the same URLs on the production server without any issue.  Not
really a solution per se, but it gets the job done.  I appreciate
everyone's feedback.




[google-appengine] Re: different urlfetch behaviour on development and production servers: URLs with the characters: ( or )

2009-05-04 Thread Ivan Maslov
I have a similar problem. On the development server the urlencode
function works correctly with unicode strings. In production an error
occurs:
UnicodeEncodeError: 'ascii' codec can't encode character u'\xe2' in position
2: ordinal not in range(128)
It occurs when I pass Russian strings as parameters.




[google-appengine] Re: different urlfetch behaviour on development and production servers: URLs with the characters: ( or )

2009-05-04 Thread Tom Wu
string.encode('utf-8')
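
Spelled out a little further, the suggestion is to encode text to
UTF-8 bytes explicitly before handing it to urlencode, so that no
implicit ASCII conversion can take place.  The sketch below uses
Python 3 (the thread itself concerns Python 2, where the same idea
applies to unicode objects).  The helper name encode_params and the
parameter name "q" are hypothetical.

```python
from urllib.parse import urlencode


def encode_params(params, charset="utf-8"):
    """URL-encode a dict, explicitly encoding text values to bytes first."""
    encoded = {}
    for key, value in params.items():
        if isinstance(value, str):
            # Explicit encoding avoids relying on any default codec.
            value = value.encode(charset)
        encoded[key] = value
    return urlencode(encoded)


# A Russian word ("privet"), written with escapes so the source stays ASCII.
word = "\u043f\u0440\u0438\u0432\u0435\u0442"
print(encode_params({"q": word}))
```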




[google-appengine] Re: different urlfetch behaviour on development and production servers: URLs with the characters: ( or )

2009-05-03 Thread Matt Trinneer

To further that post...

It seems to me that URLs containing characters such as ( and ) are not
being fetched properly on the production environment.  I've attempted
escaping the characters, as per RFC 3986.  However, the escaped URL
(http://dbpedia.org/resource/Companion_%28manga%29) doesn't fare any
better.

