[jira] [Comment Edited] (XERCESJ-1595) Xerces grammar pool not working with no namespace XSDs

Radu Coravu (JIRA) Wed, 19 Feb 2014 04:36:13 -0800

    [ https://issues.apache.org/jira/browse/XERCESJ-1595?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13500918#comment-13500918 ]


Radu Coravu edited comment on XERCESJ-1595 at 2/19/14 12:35 PM:
----------------------------------------------------------------

Hi Eric,

1.a)
Basically, the equals method needs an additional check for the case where 
fLiteralSystemId is not null. The code could probably look like this:
{code}
  /**
   * Compares this grammar with the given grammar. We compare the target
   * namespaces; when both descriptions have no namespace and both have a
   * literal system ID, we compare the literal system IDs instead.
   *
   * @param descObj The description of the grammar to be compared with
   * @return        True if they are equal, else false
   */
  public boolean equals(Object descObj) {
      if (!(descObj instanceof XMLSchemaDescription)) {
          return false;
      }
      XMLSchemaDescription desc = (XMLSchemaDescription) descObj;
      if (fNamespace == null && desc.getTargetNamespace() == null
              && fLiteralSystemId != null && desc.getLiteralSystemId() != null) {
          // Both are no-namespace schemas with known locations: compare locations.
          return fLiteralSystemId.equals(desc.getLiteralSystemId());
      } else if (fNamespace != null) {
          return fNamespace.equals(desc.getTargetNamespace());
      } else { // fNamespace == null
          return desc.getTargetNamespace() == null;
      }
  }
{code}
1.b) The hashCode method could perhaps prepend some static text to the 
fLiteralSystemId string, something like:
{code}
  // Hash the prefixed literal system ID after the ? instead of the existing 0.
  return (fNamespace == null)
      ? ("Literal: " + fLiteralSystemId).hashCode()
      : fNamespace.hashCode();
{code}
This way, a description that has a target namespace and a description whose 
literal system ID happens to equal that namespace would not get the same hash 
code (not that this is very likely).
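
To make 1.a and 1.b concrete, here is a small self-contained sketch. SchemaKey and SchemaKeyDemo are hypothetical stand-ins, not Xerces classes, and a plain HashMap plays the role of the grammar pool. One deliberate difference from the snippet above, which is my assumption and not part of the patch: a null system ID is treated as different from a non-null one, so that equals stays consistent with hashCode.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

// Hypothetical stand-in for a schema description; NOT the Xerces XSDDescription class.
final class SchemaKey {
    private final String namespace;       // target namespace, or null for a no-namespace schema
    private final String literalSystemId; // schema location, may be null

    SchemaKey(String namespace, String literalSystemId) {
        this.namespace = namespace;
        this.literalSystemId = literalSystemId;
    }

    @Override
    public boolean equals(Object obj) {
        if (!(obj instanceof SchemaKey)) {
            return false;
        }
        SchemaKey other = (SchemaKey) obj;
        if (namespace != null) {
            // Namespaced schemas are identified by their namespace alone.
            return namespace.equals(other.namespace);
        }
        // No namespace: fall back to the literal system ID.
        return other.namespace == null
                && Objects.equals(literalSystemId, other.literalSystemId);
    }

    @Override
    public int hashCode() {
        // The static prefix keeps a system ID from hashing like an identical namespace string.
        return (namespace == null)
                ? ("Literal: " + literalSystemId).hashCode()
                : namespace.hashCode();
    }
}

class SchemaKeyDemo {
    public static void main(String[] args) {
        Map<SchemaKey, String> pool = new HashMap<>();
        pool.put(new SchemaKey(null, "map.xsd"), "map grammar");
        pool.put(new SchemaKey(null, "topic.xsd"), "topic grammar");
        pool.put(new SchemaKey("urn:example", "a.xsd"), "namespaced grammar");

        System.out.println(pool.size());                                         // 3
        System.out.println(pool.get(new SchemaKey(null, "topic.xsd")));          // topic grammar
        System.out.println(pool.get(new SchemaKey("urn:example", "other.xsd"))); // namespaced grammar
    }
}
```

With a key based on the namespace alone, the two no-namespace entries would collapse into one, which is the behaviour reported in this issue.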

2) Is the field "fExternalNoNamespaceSchema" the right field for the job? If 
you add some System.err output, does it always match the schema location of 
the associated XML Schema?
I ask because the field seems to be set only externally, from a Xerces property.
The patch I tried was something like:
{code}
              fXSDDescription.setNamespace(namespace);
              Hashtable locationPairs = fLocationPairs;
              Object locationArray = locationPairs.get(
                  namespace == null ? XMLSymbols.EMPTY_STRING : namespace);
              if (locationArray != null) {
                String[] temp =
                    ((XMLSchemaLoader.LocationArray) locationArray).getLocationArray();
                if (temp.length != 0) {
                  fXSDDescription.setLiteralSystemId(temp[0]);
                }
              }
{code}
I also saved the old literal system ID held by the field and restored it 
after the grammar had been looked up in the pool, in order to avoid side 
effects.
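
The save-and-restore step could be sketched roughly as follows. Description and PoolLookup are hypothetical names used only for illustration, and the string-keyed map is a toy replacement for the real grammar pool; the point is just the try/finally around the lookup.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical mutable description, standing in for a shared field such as fXSDDescription.
final class Description {
    String namespace;
    String literalSystemId;
}

// Hypothetical validator-like class that reuses one Description for every lookup.
final class PoolLookup {
    final Description shared = new Description();
    // Toy pool keyed by "namespace|systemId"; a real grammar pool is keyed by description objects.
    private final Map<String, String> pool = new HashMap<>();

    void cache(String namespace, String systemId, String grammar) {
        pool.put(namespace + "|" + systemId, grammar);
    }

    String find(String namespace, String schemaLocation) {
        String savedSystemId = shared.literalSystemId; // remember the old value
        try {
            shared.namespace = namespace;
            shared.literalSystemId = schemaLocation;   // used only for this lookup
            return pool.get(shared.namespace + "|" + shared.literalSystemId);
        } finally {
            shared.literalSystemId = savedSystemId;    // restore, even if the lookup throws
        }
    }

    public static void main(String[] args) {
        PoolLookup lookup = new PoolLookup();
        lookup.shared.literalSystemId = "original.xsd";
        lookup.cache(null, "topic.xsd", "topic grammar");
        System.out.println(lookup.find(null, "topic.xsd"));
        System.out.println(lookup.shared.literalSystemId);
    }
}
```

Restoring in a finally block means other code that reads the shared description afterwards still sees the value it had before the lookup.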

One more thing: to patch a library in Oxygen, we create a small JAR file 
containing only the patched classes, in this case the 
"org.apache.xerces.impl.xs.XSDDescription" and 
"org.apache.xerces.impl.xs.XMLSchemaValidator" classes.
If this small JAR comes before xercesImpl.jar on the Java class path, the 
patched classes are loaded from the small JAR.
This way you avoid modifying the main xercesImpl.jar.
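
If you want to check which JAR a class was actually loaded from, a quick test along these lines might help. WhichJar and locationOf are hypothetical names I made up for the example; it relies only on standard JDK calls and prints a fallback message when Xerces is not on the class path.

```java
import java.security.CodeSource;

// Hypothetical helper: reports where a class was loaded from.
class WhichJar {
    static String locationOf(String className) {
        try {
            // 'false' avoids running static initializers; we only need the Class object.
            Class<?> c = Class.forName(className, false, WhichJar.class.getClassLoader());
            CodeSource source = c.getProtectionDomain().getCodeSource();
            // JDK core classes typically have no code source.
            return (source == null) ? "(bootstrap or unknown)" : String.valueOf(source.getLocation());
        } catch (ClassNotFoundException e) {
            return "(not on the class path)";
        }
    }

    public static void main(String[] args) {
        System.out.println(locationOf("org.apache.xerces.impl.xs.XSDDescription"));
        System.out.println(locationOf("org.apache.xerces.impl.xs.XMLSchemaValidator"));
    }
}
```

If the class path order is right, both lines should point at the small patch JAR rather than at xercesImpl.jar.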


> Xerces grammar pool not working with no namespace XSDs
> ------------------------------------------------------
>
>                 Key: XERCESJ-1595
>                 URL: https://issues.apache.org/jira/browse/XERCESJ-1595
>             Project: Xerces2-J
>          Issue Type: Bug
>          Components: XNI
>    Affects Versions: 2.11.0
>            Reporter: Eric Sirois
>            Assignee: Michael Glavassevich
>
> When loading XSDs with no namespace into the Xerces grammar pool, it assumes 
> every XSD with no namespace to be the same XSD doc.  In our case, when trying 
> to validate DITA documents, there could be a number of different types of 
> documents involved, for example map, topic, task and concepts.
> To get around the issue we need to ignore XSDs when loading the grammar pool. 
>  This affects the time needed to produce/transform documentation 
> significantly.
> If a namespace is available for the XSD, the grammar cache should use that. 
> If there is no namespace available, then it should use the system ID of 
> the schema.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)

