Re: FOP, Index and duplicate Page number

2010-06-18 Thread Giuseppe Briotti
2010/6/18 Georg Datterl :
> Hi Giuseppe,
>
> This is one line of my index:
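> [FO markup stripped by the list archive. What survives: the entry text
> "Indexed Word", followed by four fo:page-number-citation elements
> separated by ", ", with id="BE_38923893", id="BE_38923894",
> id="BE_38923895" and id="BE_38923896".]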
>
> As you can see, each fo:page-number-citation has an id. In my Java code I 
> have a Hashtable<"wiring word", Vector<"ids">>.
>
> Transforming the FO file to the area tree format yields an XML file in which 
> the ids can be found again, together with the real page numbers. I load that 
> XML into a DOM document and use XPath evaluation to get the page number for 
> each id from the hashtable:
>
> String val = xPath.evaluate(".//te...@prod-id='BE_38923893']/word/text()", 
> root, XPathConstants.STRING).toString();
>
> val now contains the page number for the page-number-citation BE_38923893. If 
> val for BE_38923894 contains the same page number, I know I can remove the 
> whole
>
>  , 
>  [fo:page-number-citation with id="BE_38923894"]
>
> from my FO file. This only works, of course, because I write the FO file from 
> code and don't have to rely on transformations. But I'm quite sure the XSLT 
> experts could come up with a transformation-based solution too. If you also 
> write out your FO file from code, I can give you some optimization hints; 
> just ask.
>
> Regards,
>
> Georg Datterl

Well, that sounds good... I start from an XML file that is basically a
transcript of speeches, so it contains speeches by several people. I can
put an id on each tag that opens a speech and follow the same approach;
the only thing left to do is to export the area tree as XML and parse the
result... I will try it and post the results/problems for further
consideration.

Working on the FOP project could be a nice thing too :-)

Thanks

G.

-- 

Giuseppe Briotti
g.brio...@gmail.com

"Alme Sol, curru nitido diem qui
promis et celas aliusque et idem
nasceris, possis nihil urbe Roma
visere maius."
(Orazio)

-
To unsubscribe, e-mail: fop-users-unsubscr...@xmlgraphics.apache.org
For additional commands, e-mail: fop-users-h...@xmlgraphics.apache.org



Re: FOP, Index and duplicate Page number

2010-06-18 Thread Georg Datterl
Hi Giuseppe,

This is one line of my index:

  

  
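[FO markup stripped by the list archive. What survives: the entry text
"Indexed Word", followed by four fo:page-number-citation elements
separated by ", ".]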


As you can see, each fo:page-number-citation has an id. In my Java code I have 
a Hashtable<"wiring word", Vector<"ids">>.

Transforming the FO file to the area tree format yields an XML file in which 
the ids can be found again, together with the real page numbers. I load that 
XML into a DOM document and use XPath evaluation to get the page number for 
each id from the hashtable:

String val = xPath.evaluate(".//te...@prod-id='BE_38923893']/word/text()", 
root, XPathConstants.STRING).toString();

val now contains the page number for the page-number-citation BE_38923893. If 
val for BE_38923894 contains the same page number, I know I can remove the whole

  , 
  [fo:page-number-citation with id="BE_38923894"]

from my FO file. This only works, of course, because I write the FO file from 
code and don't have to rely on transformations. But I'm quite sure the XSLT 
experts could come up with a transformation-based solution too. If you also 
write out your FO file from code, I can give you some optimization hints; 
just ask.
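
A minimal sketch of the lookup described above, using only the JDK's DOM and XPath classes. The XPath expression in the archived message is truncated, so the expression used here (`.//text[@prod-id='…']/word/text()`) is an assumption, and the sample XML is a hypothetical, heavily simplified stand-in for a real area tree file. The class and method names are invented for illustration.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

public class AreaTreePageLookup {

    /** Parses area tree XML (passed as a string here) into a DOM document. */
    public static Document parse(String xml) throws Exception {
        return DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
    }

    /**
     * Returns the text rendered for the citation with the given id,
     * or an empty string if no such element exists.
     * Assumes the id contains no quote characters.
     */
    public static String pageFor(Document areaTree, String id) throws Exception {
        XPath xPath = XPathFactory.newInstance().newXPath();
        // Assumed expression: the one in the original message is truncated.
        String expr = ".//text[@prod-id='" + id + "']/word/text()";
        return (String) xPath.evaluate(
                expr, areaTree.getDocumentElement(), XPathConstants.STRING);
    }

    public static void main(String[] args) throws Exception {
        // Hypothetical sample data, not real FOP output.
        String sample =
                "<areaTree>"
              + "<text prod-id='BE_38923893'><word>12</word></text>"
              + "<text prod-id='BE_38923894'><word>12</word></text>"
              + "<text prod-id='BE_38923895'><word>14</word></text>"
              + "</areaTree>";
        Document doc = parse(sample);
        String first = pageFor(doc, "BE_38923893");
        String second = pageFor(doc, "BE_38923894");
        System.out.println(first);                 // prints 12
        System.out.println(first.equals(second));  // prints true
    }
}
```

When two ids of the same index entry resolve to the same string, the second citation (and its leading separator) is a candidate for removal from the FO file.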

Regards,

Georg Datterl

-- Contact --

Georg Datterl

Geneon media solutions gmbh
Gutenstetter Straße 8a
90449 Nürnberg

HRB Nürnberg: 17193
Managing Director: Yong-Harry Steiert

Tel.: 0911/36 78 88 - 26
Fax: 0911/36 78 88 - 20

www.geneon.de

Other members of the Willmy MediaGroup:

IRS Integrated Realization Services GmbH: www.irs-nbg.de
Willmy PrintMedia GmbH: www.willmy.de
Willmy Consult & Content GmbH: www.willmycc.de





Re: FOP, Index and duplicate Page number

2010-06-18 Thread Giuseppe Briotti
Hi Pascal, hi Georg, thanks for your replies (very fast!).

Well, I can participate in the FOP community by trying to implement such a
feature, but I don't know if I have the basic knowledge to do that:
I'm an experienced developer in Java, C# and C/C++, so I probably have
the "developing knowledge", but I don't know much about the FOP
implementation itself... can you suggest a starting point?

As a symptom of my limited knowledge of FOP, I have some doubts about how
to implement Georg's solution... how can I do this: "Then I generate
the area tree for the document and look up the content of each ref-id
block. Finding blocks with identical content I delete all but one." I
agree that, in theory, this is the only available solution at
the moment, but I didn't find an example (well, I'm still searching,
of course: your answers were too fast! ;-) ).

So, in brief:

1. I have some time to spend on contributing to the FOP project, and this
implementation is probably a good test for me, but I need a
documentation starting point (I've seen the wiki and browsed through
the source code...);

2. the approach suggested by Georg sounds good to me, and again I
need an example, but probably I didn't search enough :-)

Thanks

G.




Re: FOP, Index and duplicate Page number

2010-06-18 Thread Georg Datterl
Hi Pascal,

I did not understand your answer to 3. How can I find two indexed words on the 
same page before FOP has inserted the page breaks?

Hi Giuseppe,

I use a two-pass approach. I keep a list of indexed words and their ref-ids. 
Then I generate the area tree for the document and look up the content of each 
ref-id block. Finding blocks with identical content, I delete all but one. Then 
I generate the PDF again. This works quite nicely, since I have the complete 
FO file available (and the index page in memory) and since the index always 
starts on a new page. It would even work if the index were at the front of the 
publication, since the relative position of the indexed words would not change 
(= no new page breaks will be inserted). But of course it's twice the work for 
the FOP engine.
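
The "delete all but one" step between the two passes can be sketched as follows. This is only an illustration under stated assumptions: the page number for each id is taken to be already extracted from the area tree into a map, and the class name, method name and sample page numbers are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

public class IndexDeduplicator {

    /**
     * For one index entry, returns the citation ids that can be dropped
     * because an earlier id of the same entry already resolves to the
     * same page. Ids without a known page are left alone.
     */
    public static List<String> redundantIds(List<String> ids,
                                            Map<String, String> pageById) {
        Set<String> seenPages = new HashSet<>();
        List<String> redundant = new ArrayList<>();
        for (String id : ids) {
            String page = pageById.get(id);
            // Set.add returns false when the page was already seen.
            if (page != null && !seenPages.add(page)) {
                redundant.add(id);
            }
        }
        return redundant;
    }

    public static void main(String[] args) {
        // Hypothetical result of the first pass: id -> rendered page number.
        Map<String, String> pageById = new HashMap<>();
        pageById.put("BE_38923893", "5");
        pageById.put("BE_38923894", "5");
        pageById.put("BE_38923895", "5");
        pageById.put("BE_38923896", "6");

        List<String> idsForWord = Arrays.asList(
                "BE_38923893", "BE_38923894", "BE_38923895", "BE_38923896");

        // prints [BE_38923894, BE_38923895]
        System.out.println(redundantIds(idsForWord, pageById));
    }
}
```

The ids returned here are the ones whose citation (together with its ", " separator) would be removed from the FO file before the second rendering pass.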

Regards,

Georg Datterl





Re: FOP, hypenation. How to compile hyph pattern? my own

2010-06-18 Thread lexa2009

Any ideas? What am I doing wrong? Can you write out the full procedure for
obtaining the pattern and producing the document?

lexa2009 wrote:
> 
> 
> 
> lexa2009 wrote:
>> 
>> thx, i try.
>> C:\...\fop-0.95\build>jar -tvf fop-hyph.jar
>>      0 Tue Jun 08 16:39:18 MSD 2010 META-INF/
>>    349 Tue Jun 08 16:39:16 MSD 2010 META-INF/MANIFEST.MF
>>      0 Tue Jun 08 16:33:30 MSD 2010 hyph/
>>   2593 Tue Jun 08 16:33:30 MSD 2010 hyph/bu.hyp
>>  43808 Tue Jun 08 16:33:30 MSD 2010 hyph/en.hyp
>>   2593 Tue Jun 08 16:02:00 MSD 2010 hyph/su.hyp
>> 
>> here it is..
>> in lib the same result
>> C:\...\fop-0.95\lib>jar -tvf fop-hyph.jar
>>      0 Tue Jun 08 16:39:18 MSD 2010 META-INF/
>>    349 Tue Jun 08 16:39:16 MSD 2010 META-INF/MANIFEST.MF
>>      0 Tue Jun 08 16:33:30 MSD 2010 hyph/
>>   2593 Tue Jun 08 16:33:30 MSD 2010 hyph/bu.hyp
>>  43808 Tue Jun 08 16:33:30 MSD 2010 hyph/en.hyp
>>   2593 Tue Jun 08 16:02:00 MSD 2010 hyph/su.hyp
>> 
>> J.Pietschmann wrote:
>>> 
>>> On 09.06.2010 22:11, lexa2009 wrote:

>>>> Oops, I am sorry. I use language="su", but it does not work. I also
>>>> tested with the standard English hyphenation pattern (en.hyph), but that
>>>> does not work either. My mistake here was that I typed "en" instead of
>>>> "su".
>>> 
>>> You wrote
>>>> then I use
>>>>> ant compile-hyphenation
>>>> , so I have su.hyp in fop_dir\build\classes\hyph
>>>> I put this file into fop_dir\hyph and use
>>>>> ant jar-hyphenation
>>>> so I have fop-hyph.jar in fop_dir\build, I place it in fop_dir\lib and
>>>> try
>>> 
>>> 'ant jar-hyphenation' takes the *.hyp files from
>>> fop_dir\build\classes\hyph. If you moved the su.hyp file, the
>>> jar could be empty, so check whether the hyp file is in
>>> the jar (use jar -tvf fop-hyph.jar).
>>> 
>>> J.Pietschmann
>>> 
>>> 
>>> 
>>> 
>>> 
>> 
>> 
> 
> 
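
The check suggested in the quoted reply (is the .hyp file really inside the jar?) can also be done programmatically. The sketch below uses only the JDK's `java.util.jar` classes; the class name and the default file name `fop-hyph.jar` in the current directory are assumptions made for illustration.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.Enumeration;
import java.util.List;
import java.util.jar.JarEntry;
import java.util.jar.JarFile;

public class HyphJarCheck {

    /** Returns the names of all entries in the jar that end in ".hyp". */
    public static List<String> hypEntries(String jarPath) throws IOException {
        List<String> found = new ArrayList<>();
        try (JarFile jar = new JarFile(jarPath)) {
            Enumeration<JarEntry> entries = jar.entries();
            while (entries.hasMoreElements()) {
                String name = entries.nextElement().getName();
                if (name.endsWith(".hyp")) {
                    found.add(name);
                }
            }
        }
        return found;
    }

    public static void main(String[] args) throws IOException {
        // Assumed default location; pass the real path as first argument.
        String path = args.length > 0 ? args[0] : "fop-hyph.jar";
        List<String> patterns = hypEntries(path);
        if (patterns.isEmpty()) {
            System.out.println("No compiled hyphenation patterns in " + path);
        } else {
            System.out.println("Found: " + patterns);
        }
    }
}
```

For the listing shown above, this would report hyph/bu.hyp, hyph/en.hyp and hyph/su.hyp, which suggests the jar itself is not the problem.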

-- 
View this message in context: 
http://old.nabble.com/FOP%2C-hypenation.-How-to-compile-hyph-pattern--my-own-tp28340019p28922579.html
Sent from the FOP - Users mailing list archive at Nabble.com.








Re: FOP, Index and duplicate Page number

2010-06-18 Thread Pascal Sancho
Hi Giuseppe,

1. no;
2. no, but any help with implementing such a
feature is welcome; FOP is open source!
3. yes: using a two-pass XSLT (no formatting here, but you can apply your own):


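[XML and XSLT example stripped by the list archive. What survives: the
sample text "my text with an indexed word." and the literal separators
": " and ", ".]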


HTH,

Pascal



On 18/06/2010 00:47, Giuseppe Briotti wrote:
> Hi all, I need to build an index for a very large document.
>
> The index is formatted like this:
>
> Albert ... Pag. 4, 5
> Alfred 6, 10, 11
>
> It works fine, but sometimes there are several references on the same
> page, and then the result is not so good:
>
> Albert ... Pag. 4, 5, 5, 5
> Alfred 6, 10, 10, 10, 11, 11
>
> According to the FO specs, a better result like this:
>
> Albert ... Pag. 4, 5
> Alfred 6, 10, 11
>
> can be achieved via a feature provided in XSL 1.1, like
> merge-*-index-key-reference. Unfortunately this feature does not seem to
> be supported by FOP, as per
> http://xmlgraphics.apache.org/fop/compliance.html.
>
> So, the questions are:
>
> 1. Am I missing something?
> 2. Is support for such a feature scheduled?
> 3. Has someone here solved the problem in a different way?
>
> TIA
>
>
>   

