Re: slow xalan transformation

2002-06-07 Thread KOZLOV Roman

It's possible to import/read documents encoded in "ISO-8859-1" (I tried French
accented characters) and in "Windows-1251" (Russian) into the Xindice db. So I
think it is also possible for Latin-2 documents: just set the proper system
locale and the XML encoding attribute. However, it is not possible to use such
characters in XPath expressions for queries. Queries containing only ASCII
characters work fine even when the results contain different languages; the
output is always UTF-8, which is also what Xindice uses internally. I've read
that this is because of CORBA restrictions.
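
A minimal sketch of the "xml encoding attribute" part, assuming plain JAXP
rather than Xindice itself: the parser decodes the bytes according to the
encoding named in the XML declaration, so the application sees the same
Unicode text whatever the file encoding was. The class name EncodingDemo and
the <title> element are made up for illustration, and nothing here says how
Xindice stores or queries the result.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.Charset;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;

final class EncodingDemo {
    /** Builds a tiny document whose XML declaration names the given charset. */
    static byte[] sample(String charsetName, String text) {
        String xml = "<?xml version=\"1.0\" encoding=\"" + charsetName + "\"?>"
                + "<title>" + text + "</title>";   // <title> is a hypothetical element
        return xml.getBytes(Charset.forName(charsetName));
    }

    /** Parses raw bytes; the parser decodes them using the declared encoding. */
    static String rootText(byte[] xmlBytes) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xmlBytes));
            return doc.getDocumentElement().getTextContent();
        } catch (Exception e) {
            throw new IllegalStateException("could not parse sample", e);
        }
    }

    public static void main(String[] args) {
        String french = "\u00e9t\u00e9";              // e-acute, t, e-acute
        String romanian = "\u015ftiin\u0163\u0103";   // s-cedilla, t-cedilla, a-breve
        System.out.println(french.equals(rootText(sample("ISO-8859-1", french))));
        System.out.println(romanian.equals(rootText(sample("ISO-8859-2", romanian))));
    }
}
```

The text is written as Unicode escapes so the example does not depend on the
encoding of the source file.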

Roman

Adrian Petru Dimulescu wrote:

> ... I am also having trouble with Xindice where the specific
> ISO-8859-2 characters are concerned. Apparently, importing such a Latin-2
> document into Xindice makes me lose those characters...
>
> On Monday 03 June 2002 15:47, KOZLOV Roman wrote:
> > Unfortunately, Xindice has a very hard restriction on queries: an XPath
> > expression can't contain non-ASCII characters.
> >
> > Roman
> >
> > Adrian Petru Dimulescu wrote:
> > > > I installed today a cvs cocoon on a Tomcat 4.0.3 / jdk 1.3.1_01 and it
> > > > would work fine if it weren't for the slow XSLT transformation.
> > >
> > > A bit late as a self-response, but here it goes: extracting sub-trees with
> > > XSLT is not really a sign of genius as long as tools such as Xindice exist.
> > >
> > > So a native XML database solves the problem -- an average of two seconds
> > > per chapter extraction. I'll play some more with indexes; maybe it can
> > > get even better.
> > >
> > > And as if that weren't enough, Xindice is just perfectly integrated into
> > > Cocoon...


-
Please check that your question has not already been answered in the
FAQ before posting. 

To unsubscribe, e-mail: <[EMAIL PROTECTED]>
For additional commands, e-mail: <[EMAIL PROTECTED]>




Re: slow xalan transformation

2002-06-06 Thread Adrian Petru Dimulescu

... I am also having trouble with Xindice where the specific
ISO-8859-2 characters are concerned. Apparently, importing such a Latin-2
document into Xindice makes me lose those characters...


On Monday 03 June 2002 15:47, KOZLOV Roman wrote:
> Unfortunately, Xindice has a very hard restriction on queries: an XPath
> expression can't contain non-ASCII characters.
>
> Roman
>
> Adrian Petru Dimulescu wrote:
> > > I installed today a cvs cocoon on a Tomcat 4.0.3 / jdk 1.3.1_01 and it
> > > would work fine if it weren't for the slow XSLT transformation.
> >
> > A bit late as a self-response, but here it goes: extracting sub-trees with
> > XSLT is not really a sign of genius as long as tools such as Xindice exist.
> >
> > So a native XML database solves the problem -- an average of two seconds
> > per chapter extraction. I'll play some more with indexes; maybe it can
> > get even better.
> >
> > And as if that weren't enough, Xindice is just perfectly integrated into
> > Cocoon...




Re: slow xalan transformation

2002-06-03 Thread KOZLOV Roman

Unfortunately, Xindice has a very hard restriction on queries: an XPath
expression can't contain non-ASCII characters.

Roman

Adrian Petru Dimulescu wrote:

> > I installed today a cvs cocoon on a Tomcat 4.0.3 / jdk 1.3.1_01 and it
> > would work fine if it weren't for the slow XSLT transformation.
>
> A bit late as a self-response, but here it goes: extracting sub-trees with
> XSLT is not really a sign of genius as long as tools such as Xindice exist.
>
> So a native XML database solves the problem -- an average of two seconds per
> chapter extraction. I'll play some more with indexes; maybe it can get even
> better.
>
> And as if that weren't enough, Xindice is just perfectly integrated into
> Cocoon...




Re: slow xalan transformation

2002-05-31 Thread Adrian Petru Dimulescu

> I installed today a cvs cocoon on a Tomcat 4.0.3 / jdk 1.3.1_01 and it
> would work fine if it weren't for the slow XSLT transformation.

A bit late as a self-response, but here it goes: extracting sub-trees with XSLT
is not really a sign of genius as long as tools such as Xindice exist.

So a native XML database solves the problem -- an average of two seconds per
chapter extraction. I'll play some more with indexes; maybe it can get even
better.

And as if that weren't enough, Xindice is just perfectly integrated into
Cocoon...






RE: slow xalan transformation

2002-05-22 Thread Vadim Gritsenko

> From: Adrian Petru Dimulescu [mailto:[EMAIL PROTECTED]]
> 
> Hello,
> 
> I imagine a search engine which would propose several relevant paragraphs
> in several books of a small digital library. The user would click on the
> first result and cocoon serves the 2nd paragraph of the 3rd chapter of
> Genesis of the Old Testament of the Bible
> (in other words: http://url/bible?select=1/1/3/2 )
>
> It would then be nice to be able to serve a subtree of my choice.
>
> I would also like the user to be able to browse a tree-like structure of
> this small digital library. The user opens the "Bible" node and then the
> Genesis node and then its first subchapter. Then he closes everything up
> and opens the, say, Shakespeare node, and then Macbeth and then its acts.
>
> The idea, I guess, is to build an arborescent structure of books.
>
> *
> I am no expert in XSLT processing but somehow I took for granted that
> isolating a *small* subtree of a *big* XML should not mean actually
> scanning the whole big file.

It does mean exactly that, unless you help it with a predicate like [1] to
extract only the first occurrence. Then it could be possible to scan only
from the beginning of the file to the end of the desired chapter, and
not the whole file.
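
A rough sketch of the "stop at the end of the desired chapter" idea, with a
hand-written SAX handler swapped in for the XSLT predicate (whether a given
XSLT processor really stops early on [1] is not something this shows). The
handler collects the text of the first element with a given name and id, then
aborts the parse on its end tag, so nothing after it is processed. The names
FirstMatchExtractor, Result and the id attribute are assumptions made for the
example.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

final class FirstMatchExtractor {
    /** Text of the matched element plus how many start tags were read to get it. */
    static final class Result {
        final String text;          // null when nothing matched
        final int startTagsRead;
        Result(String text, int startTagsRead) {
            this.text = text;
            this.startTagsRead = startTagsRead;
        }
    }

    private static final class StopAfterMatch extends DefaultHandler {
        private final String element;
        private final String id;
        private final StringBuilder text = new StringBuilder();
        private int depth = 0;      // > 0 while inside the wanted element
        private int startTags = 0;
        private boolean done = false;

        StopAfterMatch(String element, String id) {
            this.element = element;
            this.id = id;
        }

        @Override
        public void startElement(String uri, String local, String qName, Attributes atts) {
            startTags++;
            if (depth > 0) {
                depth++;            // a child nested inside the match
            } else if (qName.equals(element) && id.equals(atts.getValue("id"))) {
                depth = 1;          // entering the wanted element
            }
        }

        @Override
        public void characters(char[] ch, int start, int length) {
            if (depth > 0) {
                text.append(ch, start, length);
            }
        }

        @Override
        public void endElement(String uri, String local, String qName) throws SAXException {
            if (depth > 0 && --depth == 0) {
                done = true;
                throw new SAXException("match complete, aborting parse on purpose");
            }
        }
    }

    static Result extract(String xml, String element, String id) {
        StopAfterMatch handler = new StopAfterMatch(element, id);
        try {
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), handler);
        } catch (SAXException e) {
            if (!handler.done) {    // a real parse error, not our deliberate stop
                throw new IllegalStateException("malformed input", e);
            }
        } catch (Exception e) {
            throw new IllegalStateException("parser setup or I/O failure", e);
        }
        return new Result(handler.done ? handler.text.toString() : null, handler.startTags);
    }
}
```

The startTagsRead counter is only there to make the early stop visible: for a
book with three chapters, asking for the second one reads three start tags
(the root and two chapters) and never reaches the third chapter.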


> I suspect however that Xalan validates the XML before doing the
> transformation

No, Xalan does not validate XML. And Xerces does not validate XML either,
with the default Cocoon settings.


> which is slow because then the *entire* file has to be actually scanned.

It has to be scanned for other reasons.


> In that sense, I wonder if it is possible to disable validating.

It is disabled by default.
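
For reference, a small sketch that checks the same default on the plain JAXP
factories (this is the standard JAXP behaviour, not Cocoon's own parser
configuration, which is not shown in this thread). The class name
ValidationDefaults is made up for the example.

```java
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.SAXParserFactory;

final class ValidationDefaults {
    /** True if a freshly created SAX factory would produce validating parsers. */
    static boolean saxValidatesByDefault() {
        return SAXParserFactory.newInstance().isValidating();
    }

    /** True if a freshly created DOM factory would produce validating parsers. */
    static boolean domValidatesByDefault() {
        return DocumentBuilderFactory.newInstance().isValidating();
    }

    /** Returns a SAX factory with DTD validation switched off explicitly. */
    static SAXParserFactory nonValidatingFactory() {
        SAXParserFactory factory = SAXParserFactory.newInstance();
        factory.setValidating(false);   // redundant with the default, but explicit
        return factory;
    }

    public static void main(String[] args) {
        System.out.println("SAX validating by default: " + saxValidatesByDefault());
        System.out.println("DOM validating by default: " + domValidatesByDefault());
    }
}
```

Note that a non-validating parser still has to read the DTD when the document
relies on entities declared there, as the chapter entities in biblie.xml do, so
turning validation off does not by itself avoid that work.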

Vadim


> 
> Best regards,
> Adrian.
> 
> 
> 
> On Sunday 19 May 2002 21:51, William Brogden wrote:
> > Scanning the entire bible to pick a chapter seems very
> > wasteful. If you never serve more than one chapter, why
> > not store the chapters separately?







Re: slow xalan transformation

2002-05-19 Thread Adrian Petru Dimulescu

Hello,

I imagine a search engine which would propose several relevant paragraphs in
several books of a small digital library. The user would click on the first
result and cocoon serves the 2nd paragraph of the 3rd chapter of Genesis of
the Old Testament of the Bible
(in other words: http://url/bible?select=1/1/3/2 )

It would then be nice to be able to serve a subtree of my choice.
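
To make the select=1/1/3/2 idea concrete, here is a minimal sketch that turns
such a path into a positional XPath and evaluates it with the standard
javax.xml.xpath API. The reading of the numbers as "n-th child element at each
level below the root" is an assumption, as are the class name SelectPath and
the sample element names. It builds a full DOM, so it only illustrates the
addressing scheme and does nothing for speed.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

final class SelectPath {
    /** Converts "1/1/3/2" into "/*/*[1]/*[1]/*[3]/*[2]" (root element first). */
    static String toXPath(String select) {
        StringBuilder xpath = new StringBuilder("/*");
        for (String step : select.split("/")) {
            int index = Integer.parseInt(step);   // rejects anything non-numeric
            if (index < 1) {
                throw new IllegalArgumentException("positions start at 1: " + step);
            }
            xpath.append("/*[").append(index).append(']');
        }
        return xpath.toString();
    }

    /** Returns the text content of the subtree addressed by the select path. */
    static String selectText(String xml, String select) {
        String expression = toXPath(select);
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            return XPathFactory.newInstance().newXPath().evaluate(expression, doc);
        } catch (Exception e) {
            throw new IllegalStateException("could not evaluate " + expression, e);
        }
    }

    public static void main(String[] args) {
        // Hypothetical element names; only the positions matter here.
        String xml = "<bible><ot><genesis>"
                + "<ch><p>a</p></ch><ch><p>b</p></ch><ch><p>c1</p><p>c2</p></ch>"
                + "</genesis></ot></bible>";
        System.out.println(toXPath("1/1/3/2"));
        System.out.println(selectText(xml, "1/1/3/2"));
    }
}
```

Parsing each step as an integer also keeps arbitrary request text out of the
XPath expression.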

I would also like the user to be able to browse a tree-like structure of this 
small digital library. The user opens the "Bible" node and then the Genesis 
node and then its first subchapter. Then he closes everything up and opens 
the, say, Shakespeare node, and then Macbeth and then its acts.

The idea, I guess, is to build an arborescent structure of books.

*
I am no expert in XSLT processing but somehow I took for granted that 
isolating a *small* subtree of a *big* XML should not mean actually scanning 
the whole big file.

I suspect, however, that Xalan validates the XML before doing the
transformation, which is slow because then the *entire* file has to actually
be scanned. In that sense, I wonder if it is possible to disable validation.

Best regards,
Adrian.



On Sunday 19 May 2002 21:51, William Brogden wrote:
> Scanning the entire bible to pick a chapter seems very
> wasteful. If you never serve more than one chapter, why
> not store the chapters separately?






RE: slow xalan transformation

2002-05-19 Thread William Brogden

Scanning the entire bible to pick a chapter seems very
wasteful. If you never serve more than one chapter, why
not store the chapters separately?
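
One possible way to do that split, sketched under assumptions: the chapters
are taken to be elements with a common name and an identifying attribute (both
passed in as parameters, since the real markup is not shown here), and each
one is serialized on its own with an identity transform. The sketch returns
the pieces as strings; a real setup would write each to its own file once,
ahead of time, so that serving a chapter never touches the big document.

```java
import java.io.StringReader;
import java.io.StringWriter;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

final class ChapterSplitter {
    /** Maps each chapter's id attribute value to that chapter serialized as XML. */
    static Map<String, String> split(String xml, String elementName, String idAttribute) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));

            // A transformer created without a stylesheet copies its input unchanged.
            Transformer identity = TransformerFactory.newInstance().newTransformer();
            identity.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");

            Map<String, String> chapters = new LinkedHashMap<>();
            NodeList nodes = doc.getElementsByTagName(elementName);
            for (int i = 0; i < nodes.getLength(); i++) {
                Element chapter = (Element) nodes.item(i);
                StringWriter out = new StringWriter();
                identity.transform(new DOMSource(chapter), new StreamResult(out));
                chapters.put(chapter.getAttribute(idAttribute), out.toString());
            }
            return chapters;
        } catch (Exception e) {
            throw new IllegalStateException("could not split document", e);
        }
    }

    public static void main(String[] args) {
        String xml = "<book>"
                + "<chapter id=\"genesis\"><p>In the beginning</p></chapter>"
                + "<chapter id=\"exodus\"><p>Now these are the names</p></chapter>"
                + "</book>";
        for (Map.Entry<String, String> e : split(xml, "chapter", "id").entrySet()) {
            System.out.println(e.getKey() + " -> " + e.getValue());
        }
    }
}
```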

[EMAIL PROTECTED]
Author of Soap Programming with Java - Sybex; ISBN: 0782129285



> -----Original Message-----
> From: Adrian Petru Dimulescu [mailto:[EMAIL PROTECTED]] 
> Sent: Sunday, May 19, 2002 2:44 PM
> To: [EMAIL PROTECTED]
> Subject: slow xalan transformation
> 
> 
> Hello,
> 
> I installed today a cvs cocoon on a Tomcat 4.0.3 / jdk 1.3.1_01 and it
> would work fine if it weren't for the slow XSLT transformation.
>
> What do I mean by slow?
>
> I want to make HTML versions of Bible chapters. In order to do that, I
> have a main biblie.xml file which includes all its chapters using
> entities. I apply a stylesheet which simply selects a chapter and then
> another stylesheet which transforms this selection to HTML markup.
>
> It should be said that the XML file (with its chapters) is no small XML
> file (it is the Bible after all).
>
> The relevant sitemap.conf part:
> 
> <map:transform src="carti/stylesheets/tei-select-subdiv.xsl">
>   <map:parameter name="use-request-parameters" value="true"/>
> </map:transform>
> 
> I measured the time it takes several XSLT processors on my machine
> (Thunderbird 1.2 GHz, 250MB RAM) to isolate a chapter of the Bible (say,
> Genesis or Matthew):
>
> * xalan2 (Java):  40 seconds
> * saxon6.5 (Java):  18 seconds
> * xsltproc (C): 3 seconds !
>
> Under jdk1.4 xalan is catastrophic: it takes more than 2 minutes to do
> this transformation.
>
> Now, my question is: do you think these times are normal, do they include
> DTD validation and if they do how can I disable DTD validation?
>
> Is there a Java solution to this problem (other than writing a
> TraxTransformer implementation which would simply execute xsltproc?)
> 
> ==
> Thank you,
> Adrian Petru Dimulescu.
> 
> 
> 
> * * *
> 
> PS: 
> Here is a sketch of the biblie.xml:
> 
>  "/home/dadi/xml/dtds/tei/myPizza.dtd" [
>
> Vechiul Testament
>
> &facerea;
> &iesirea;


