Re: CDATA in MarcXML

2012-10-30 Thread Giovanni Di Milia
Hi all (again),
I ran some tests and modified the code of "encode_for_xml" in the
following way, and it seems to work fine:

def encode_for_xml(text, wash=False, xml_version='1.0', quote=False):
    """Encodes special characters in a text so that it would be
    XML-compliant.
    @param text: text to encode
    @return: an encoded text"""
    text = text.replace('&', '&amp;')
    text = text.replace('<', '&lt;')
    text = text.replace('>', '&gt;')
    if quote:
        text = text.replace('"', '&quot;')
    if wash:
        text = wash_for_xml(text, xml_version=xml_version)
    return text

I repeat that I don't know why all the XML special characters are not
escaped, but even this solution looks semantically wrong to me,
because it doesn't follow the W3C guidelines:
http://www.w3.org/TR/xml/#syntax

A correct function should escape in this way:
"   &quot;
'   &apos;
<   &lt;
>   &gt;
&   &amp;

while a CDATA section should not be escaped,
but at least now the XML generated (and stored in bibfmt) is valid.
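
As a minimal sketch of the mapping listed above, the five predefined
entities can be produced with the standard library's
xml.sax.saxutils.escape. The name escape_all_xml_specials is hypothetical
and is not part of Invenio; the sketch only illustrates the W3C mapping
and does not attempt to pass CDATA sections through unescaped.

```python
from xml.sax.saxutils import escape


def escape_all_xml_specials(text):
    """Escape the five characters that have predefined XML entities.

    Hypothetical helper for illustration; not an Invenio function.
    """
    # escape() always handles '&', '<' and '>' and replaces '&' first,
    # so the ampersands introduced by the other replacements are not
    # escaped a second time. The extra mapping adds the two quotes.
    return escape(text, {'"': '&quot;', "'": '&apos;'})


print(escape_all_xml_specials('<![CDATA[a & b]]>'))
```

With this approach the CDATA markers themselves are turned into plain
character data, which keeps the output well-formed at the cost of no
longer being a real CDATA section.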

Thanks for your help,
Giovanni


--
Giovanni Di Milia
IT Specialist at SAO/NASA ADS
Harvard-Smithsonian Center for Astrophysics
60 Garden Street, MS 83
Cambridge, MA 02138 USA
email: gdimi...@cfa.harvard.edu
--

On Tue, Oct 30, 2012 at 2:44 PM, Giovanni Di Milia wrote:
> Hi all,
> here at ADS we have a problem with some metadata that contain CDATA elements.
> The problem is caused by the export procedure of Invenio that doesn't
> properly encode these elements.
>
> What happens is that all the elements like
> ''
> are converted to
> ''
> and this in XML is an error.
>
> After reading a very similar discussion from 2010 (started by Benoit),
> I suppose that the problem is still in
> invenio.textutils.encode_for_xml()
> which is used in
> bibformat_utils.record_get_xml().
>
> I honestly don't understand why all the tags inside a subfield are
> not escaped (but I suppose there is a good reason) but in case of
> CDATA the tag should be completely escaped.
>
> Thanks for your help,
>
> Giovanni


CDATA in MarcXML

2012-10-30 Thread Giovanni Di Milia
Hi all,
here at ADS we have a problem with some metadata that contain CDATA elements.
The problem is caused by the export procedure of Invenio that doesn't
properly encode these elements.

What happens is that all the elements like
''
are converted to
''
and this in XML is an error.
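
The failure described above can be reproduced in a small sketch. It
assumes, since the message does not say so explicitly, that the export
escapes '&' and '<' but leaves '>' alone, so the escaped text still
contains the sequence ']]>', which XML does not allow in character data.
The sample value and the subfield wrapper are made up for illustration.

```python
import xml.etree.ElementTree as ET


def is_well_formed(xml_text):
    """Return True if xml_text parses as a well-formed XML document."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False


# A subfield value holding a CDATA section (made-up sample data).
value = '<![CDATA[x < y]]>'

# Assumed behaviour of the export: '&' and '<' are escaped, '>' is not.
partially_escaped = value.replace('&', '&amp;').replace('<', '&lt;')

# Escaping '>' as well removes the stray ']]>' sequence.
fully_escaped = partially_escaped.replace('>', '&gt;')

broken = '<subfield code="a">%s</subfield>' % partially_escaped
fixed = '<subfield code="a">%s</subfield>' % fully_escaped

print(is_well_formed(broken))
print(is_well_formed(fixed))
```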

After reading a very similar discussion from 2010 (started by Benoit),
I suppose that the problem is still in
invenio.textutils.encode_for_xml()
which is used in
bibformat_utils.record_get_xml().

I honestly don't understand why all the tags inside a subfield are
not escaped (but I suppose there is a good reason) but in case of
CDATA the tag should be completely escaped.

Thanks for your help,

Giovanni






Wrong error msg in "Submit" tab when no submit role is connected to a user

2012-10-30 Thread Theodoros Theodoropoulos

Hello everyone,

When a logged-in user has NO submit permissions (for any doctype) and
tries to click the submit button (on top), he gets:
"Account 'x...@yyy.zzz' is not yet activated. Try to login with
another account"


Which is misleading... He should get an error that says "You are not 
authorized to perform this action."
Probably, for registered but not-yet-activated accounts an extra check 
should be performed...


Can you verify it, or did I break something?

Best regards,
Theodoros


ps. Call is initiated in websubmit_webinterface.py
[...]
    if not at_least_one_submission_authorized and submission_exists:
        if isGuestUser(uid):
            return redirect_to_url(req, "%s/youraccount/login%s" % (
                CFG_SITE_SECURE_URL,
                make_canonical_urlargd({'referer' : CFG_SITE_SECURE_URL + req.unparsed_uri,
                                        'ln' : args['ln']}, {})),
                norobot=True)
        else:
            return page_not_authorized(req, "../submit",  # this is executed
                                       uid=uid,
                                       navmenuid='submit')
    return home(req,catalogues_text, c,ln)

and then page_not_authorized is called:
[...]
    if res and res[0][0]:
        if text:
            body = text
        else:
            body = "%s %s" % (CFG_WEBACCESS_WARNING_MSGS[9] % cgi.escape(res[0][0]),  # this is executed
                              ("%s %s" % (CFG_WEBACCESS_MSGS[0] % urllib.quote(referer),
                                          CFG_WEBACCESS_MSGS[1])))

[...]

but from access_control_config:
CFG_WEBACCESS_WARNING_MSGS[9] = "Account '%s' is not yet activated."
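
The extra check suggested earlier in this message could look roughly
like the sketch below. Everything in it is hypothetical: the function
name, its parameters and the two constants are stand-ins for
illustration and are not taken from Invenio. It only shows the intended
decision, namely that the "not yet activated" text is used solely when
the account really is not activated.

```python
from html import escape

# Hypothetical message templates mirroring the texts quoted above.
NOT_ACTIVATED_MSG = "Account '%s' is not yet activated."
NOT_AUTHORIZED_MSG = "You are not authorized to perform this action."


def not_authorized_body(email, account_is_activated):
    """Pick the message body for a logged-in user who was refused access.

    Hypothetical helper: account_is_activated stands in for whatever
    lookup tells the caller whether the account has been activated.
    """
    if not account_is_activated:
        # Only a genuinely inactive account gets the activation message.
        return NOT_ACTIVATED_MSG % escape(email)
    # An activated account without the needed role gets the generic text.
    return NOT_AUTHORIZED_MSG


print(not_authorized_body('user@example.org', True))
```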