Jenkins build is back to normal : POI » POI-DSL-1.17 #490

2023-03-17 Thread Apache Jenkins Server
See 



-
To unsubscribe, e-mail: dev-unsubscr...@poi.apache.org
For additional commands, e-mail: dev-h...@poi.apache.org



Build failed in Jenkins: POI » POI-DSL-1.17 #489

2023-03-17 Thread Apache Jenkins Server
See 


Changes:

[PJ Fanning] [bug-66503] Add flag for Excel 4 macros in composite documents. 
Thanks to M. P. Halpin

[PJ Fanning] [bug-66532] more performant way to iterate over codepoints.

[PJ Fanning] [bug-66532] more performant way to iterate over codepoints.

[PJ Fanning] [bug-66532] more performant way to iterate over codepoints. Thanks 
to Matthias Raschhofer


--
[...truncated 276.79 KB...]
M V EI2: new org.apache.poi.hwpf.converter.TextDocumentFacade(Document) may 
expose internal representation by storing an externally mutable object into 
TextDocumentFacade.document  At TextDocumentFacade.java:[line 37]
M V EI2: new org.apache.poi.hwpf.model.PicturesTable(HWPFDocument, byte[], 
byte[]) may expose internal representation by storing an externally mutable 
object into PicturesTable._dataStream  At PicturesTable.java:[line 88]
M V EI2: new org.apache.poi.hwpf.model.PicturesTable(HWPFDocument, byte[], 
byte[]) may expose internal representation by storing an externally mutable 
object into PicturesTable._document  At PicturesTable.java:[line 87]
M V EI2: new org.apache.poi.hwpf.model.PicturesTable(HWPFDocument, byte[], 
byte[], FSPATable, OfficeArtContent) may expose internal representation by 
storing an externally mutable object into PicturesTable._mainStream  At 
PicturesTable.java:[line 80]
M V EI2: new org.apache.poi.hwpf.model.PicturesTable(HWPFDocument, byte[], 
byte[], FSPATable, OfficeArtContent) may expose internal representation by 
storing an externally mutable object into PicturesTable._dataStream  At 
PicturesTable.java:[line 79]
M V EI2: new org.apache.poi.hwpf.model.PicturesTable(HWPFDocument, byte[], 
byte[], FSPATable, OfficeArtContent) may expose internal representation by 
storing an externally mutable object into PicturesTable._document  At 
PicturesTable.java:[line 78]
M V EI2: new org.apache.poi.hwpf.model.PicturesTable(HWPFDocument, byte[], 
byte[]) may expose internal representation by storing an externally mutable 
object into PicturesTable._mainStream  At PicturesTable.java:[line 89]
M D DLS: Dead store to rgnCntBytes in 
org.apache.poi.hemf.record.emf.HemfFill.readRgnData(LittleEndianInputStream, 
List)  At HemfFill.java:[line 821]
M V EI: org.apache.poi.hwpf.usermodel.HeaderStories.getRange() may expose 
internal representation by returning HeaderStories.headerStories  At 
HeaderStories.java:[line 365]
M V EI: org.apache.poi.hpbf.HPBFDocument.getMainContents() may expose internal 
representation by returning HPBFDocument.mainContents  At 
HPBFDocument.java:[line 71]
M V EI: org.apache.poi.hpbf.HPBFDocument.getEscherDelayStm() may expose 
internal representation by returning HPBFDocument.escherDelayStm  At 
HPBFDocument.java:[line 80]
M V EI: org.apache.poi.hpbf.HPBFDocument.getQuillContents() may expose internal 
representation by returning HPBFDocument.quillContents  At 
HPBFDocument.java:[line 74]
M V EI: org.apache.poi.hpbf.HPBFDocument.getEscherStm() may expose internal 
representation by returning HPBFDocument.escherStm  At HPBFDocument.java:[line 
77]
M V EI: org.apache.poi.hwpf.converter.FoDocumentFacade.getDocument() may expose 
internal representation by returning FoDocumentFacade.document  At 
FoDocumentFacade.java:[line 235]
M V EI2: new org.apache.poi.hwpf.converter.FoDocumentFacade(Document) may 
expose internal representation by storing an externally mutable object into 
FoDocumentFacade.document  At FoDocumentFacade.java:[line 41]
M V EI: org.apache.poi.hemf.usermodel.HemfPicture.getRecords() may expose 
internal representation by returning HemfPicture.records  At 
HemfPicture.java:[line 106]
M V EI: org.apache.poi.hwmf.record.HwmfMisc$WmfCreatePenIndirect.getDimension() 
may expose internal representation by returning 
HwmfMisc$WmfCreatePenIndirect.dimension  At HwmfMisc.java:[line 735]
M V EI: org.apache.poi.hwmf.record.HwmfMisc$WmfCreatePenIndirect.getColorRef() 
may expose internal representation by returning 
HwmfMisc$WmfCreatePenIndirect.colorRef  At HwmfMisc.java:[line 739]
M V EI: org.apache.poi.hwmf.record.HwmfDraw$WmfPolygon.getPoly() may expose 
internal representation by returning HwmfDraw$WmfPolygon.poly  At 
HwmfDraw.java:[line 190]
M V EI: org.apache.poi.hdgf.streams.Stream.getPointer() may expose internal 
representation by returning Stream.pointer  At Stream.java:[line 38]
M V EI2: new org.apache.poi.hpbf.extractor.PublisherTextExtractor(HPBFDocument) 
may expose internal representation by storing an externally mutable object into 
PublisherTextExtractor.doc  At PublisherTextExtractor.java:[line 40]
M V EI: org.apache.poi.hpbf.extractor.PublisherTextExtractor.getFilesystem() 
may expose internal representation by returning PublisherTextExtractor.doc  At 
PublisherTextExtractor.java:[line 118]
M V EI: org.apache.poi.hpbf.extractor.PublisherTextExtractor.getDocument() may 
expose internal representation by 

Build failed in Jenkins: POI » POI-DSL-Windows-1.8 #738

2023-03-17 Thread Apache Jenkins Server
See 


Changes:

[PJ Fanning] [bug-66503] Add flag for Excel 4 macros in composite documents. 
Thanks to M. P. Halpin

[PJ Fanning] [bug-66532] more performant way to iterate over codepoints.

[PJ Fanning] [bug-66532] more performant way to iterate over codepoints.

[PJ Fanning] [bug-66532] more performant way to iterate over codepoints. Thanks 
to Matthias Raschhofer


--
[...truncated 251.53 KB...]
M V EI: org.apache.poi.hpbf.HPBFDocument.getEscherStm() may expose internal 
representation by returning HPBFDocument.escherStm  At HPBFDocument.java:[line 
77]
M V EI: org.apache.poi.hwpf.converter.FoDocumentFacade.getDocument() may expose 
internal representation by returning FoDocumentFacade.document  At 
FoDocumentFacade.java:[line 235]
M V EI2: new org.apache.poi.hwpf.converter.FoDocumentFacade(Document) may 
expose internal representation by storing an externally mutable object into 
FoDocumentFacade.document  At FoDocumentFacade.java:[line 41]
M V EI: org.apache.poi.hemf.usermodel.HemfPicture.getRecords() may expose 
internal representation by returning HemfPicture.records  At 
HemfPicture.java:[line 106]
M V EI: org.apache.poi.hwmf.record.HwmfMisc$WmfCreatePenIndirect.getDimension() 
may expose internal representation by returning 
HwmfMisc$WmfCreatePenIndirect.dimension  At HwmfMisc.java:[line 735]
M V EI: org.apache.poi.hwmf.record.HwmfMisc$WmfCreatePenIndirect.getColorRef() 
may expose internal representation by returning 
HwmfMisc$WmfCreatePenIndirect.colorRef  At HwmfMisc.java:[line 739]
M V EI: org.apache.poi.hwmf.record.HwmfDraw$WmfPolygon.getPoly() may expose 
internal representation by returning HwmfDraw$WmfPolygon.poly  At 
HwmfDraw.java:[line 190]
M V EI: org.apache.poi.hdgf.streams.Stream.getPointer() may expose internal 
representation by returning Stream.pointer  At Stream.java:[line 38]
M V EI2: new org.apache.poi.hpbf.extractor.PublisherTextExtractor(HPBFDocument) 
may expose internal representation by storing an externally mutable object into 
PublisherTextExtractor.doc  At PublisherTextExtractor.java:[line 40]
M V EI: org.apache.poi.hpbf.extractor.PublisherTextExtractor.getFilesystem() 
may expose internal representation by returning PublisherTextExtractor.doc  At 
PublisherTextExtractor.java:[line 118]
M V EI: org.apache.poi.hpbf.extractor.PublisherTextExtractor.getDocument() may 
expose internal representation by returning PublisherTextExtractor.doc  At 
PublisherTextExtractor.java:[line 103]
M V EI: 
org.apache.poi.hemf.record.emfplus.HemfPlusImage$EmfPlusImage.getImageData() 
may expose internal representation by returning 
HemfPlusImage$EmfPlusImage.imageData  At HemfPlusImage.java:[line 294]
M D SF: Switch statement found in 
org.apache.poi.hemf.record.emfplus.HemfPlusImage$EmfPlusImage.getContentType(byte[])
 where default case is missing  At HemfPlusImage.java:[lines 517-531]
M D REC: Exception is caught when Exception is not thrown in 
org.apache.poi.hemf.record.emfplus.HemfPlusImage$EmfPlusImage.getBounds(List)  
At HemfPlusImage.java:[line 445]
M V EI: 
org.apache.poi.hemf.record.emf.HemfComment$EmfCommentDataWMF.getWMFData() may 
expose internal representation by returning 
HemfComment$EmfCommentDataWMF.wmfData  At HemfComment.java:[line 613]
M V EI: 
org.apache.poi.hemf.record.emf.HemfComment$EmfCommentDataWMF.getBounds() may 
expose internal representation by returning 
HemfComment$EmfCommentDataWMF.bounds  At HemfComment.java:[line 617]
M D DLS: Dead store to version in 
org.apache.poi.hemf.record.emf.HemfComment$EmfCommentDataWMF.init(LittleEndianInputStream,
 long)  At HemfComment.java:[line 587]
M D DLS: Dead store to checksum in 
org.apache.poi.hemf.record.emf.HemfComment$EmfCommentDataWMF.init(LittleEndianInputStream,
 long)  At HemfComment.java:[line 593]
M D DLS: Dead store to flags in 
org.apache.poi.hemf.record.emf.HemfComment$EmfCommentDataWMF.init(LittleEndianInputStream,
 long)  At HemfComment.java:[line 596]
M V EI2: new 
org.apache.poi.hwpf.converter.WordToTextConverter(TextDocumentFacade) may 
expose internal representation by storing an externally mutable object into 
WordToTextConverter.textDocumentFacade  At WordToTextConverter.java:[line 158]
M V EI2: new org.apache.poi.hwpf.extractor.WordExtractor(HWPFDocument) may 
expose internal representation by storing an externally mutable object into 
WordExtractor.doc  At WordExtractor.java:[line 74]
M V EI: org.apache.poi.hwpf.extractor.WordExtractor.getFilesystem() may expose 
internal representation by returning WordExtractor.doc  At 
WordExtractor.java:[line 290]
M V EI: org.apache.poi.hwpf.extractor.WordExtractor.getDocument() may expose 
internal representation by returning WordExtractor.doc  At 
WordExtractor.java:[line 275]
M D DLS: Dead store to signature in 
org.apache.poi.hemf.record.emf.HemfFont.init(LittleEndianInputStream, long)  At 
HemfFont.java:[line 457]
M 

[GitHub] [poi] pjfanning commented on pull request #436: Add flag for Excel 4 macros in composite documents

2023-03-17 Thread via GitHub


pjfanning commented on PR #436:
URL: https://github.com/apache/poi/pull/436#issuecomment-1474554346

   I added this but then reverted it due to 
https://github.com/apache/poi/actions/runs/4452719179/jobs/7820620683
   
   Causes issues like this:
   
   ```
   Caused by: java.lang.IllegalStateException: bad function index (54, false)
       at org.apache.poi.ss.formula.ptg.AbstractFunctionPtg.lookupName(AbstractFunctionPtg.java:143)
       at org.apache.poi.ss.formula.ptg.FuncVarPtg.lookupName(FuncVarPtg.java:82)
       at org.apache.poi.ss.formula.ptg.AbstractFunctionPtg.getName(AbstractFunctionPtg.java:72)
       at org.apache.poi.ss.formula.ptg.AbstractFunctionPtg.toFormulaString(AbstractFunctionPtg.java:95)
       at org.apache.poi.ss.formula.FormulaRenderer.toFormulaString(FormulaRenderer.java:99)
       at org.apache.poi.hssf.model.HSSFFormulaParser.toFormulaString(HSSFFormulaParser.java:89)
       at org.apache.poi.hssf.usermodel.HSSFCell.getCellFormula(HSSFCell.java:643)
       at org.apache.poi.hssf.usermodel.HSSFCell.toString(HSSFCell.java:1045)
       at org.apache.poi.stress.SpreadsheetHandler.readContent(SpreadsheetHandler.java:84)
       at org.apache.poi.stress.SpreadsheetHandler.handleWorkbook(SpreadsheetHandler.java:38)
       at org.apache.poi.stress.HSSFFileHandler.handleFile(HSSFFileHandler.java:43)
       at org.apache.poi.stress.TestAllFiles.lambda$handleFile$1(TestAllFiles.java:209)
       at org.junit.jupiter.api.AssertDoesNotThrow.assertDoesNotThrow(AssertDoesNotThrow.java:49)
       ... 41 more
   ```


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscr...@poi.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org





[GitHub] [poi] pjfanning commented on pull request #443: [Bug-66532] Improve performance of SheetDataWriter

2023-03-17 Thread via GitHub


pjfanning commented on PR #443:
URL: https://github.com/apache/poi/pull/443#issuecomment-1474516597

   I added 
https://github.com/apache/poi/commit/0275daa5deae2e0069badd1f46268abb43fbc3dc - 
not exactly what is in your PR. Could you try this out in your benchmark?





[GitHub] [poi] pjfanning commented on pull request #443: [Bug-66532] Improve performance of SheetDataWriter

2023-03-17 Thread via GitHub


pjfanning commented on PR #443:
URL: https://github.com/apache/poi/pull/443#issuecomment-1474422688

   thanks - looks good. Let me look over this but it looks like this should be 
safe to merge over the coming days.





[GitHub] [poi] rascmatt commented on pull request #443: [Bug-66532] Improve performance of SheetDataWriter

2023-03-17 Thread via GitHub


rascmatt commented on PR #443:
URL: https://github.com/apache/poi/pull/443#issuecomment-1474419787

   https://github.com/rascmatt/poi-benchmark
   The (updated) benchmark with code and results.





[GitHub] [poi] rascmatt commented on a diff in pull request #443: [Bug-66532] Improve performance of SheetDataWriter

2023-03-17 Thread via GitHub


rascmatt commented on code in PR #443:
URL: https://github.com/apache/poi/pull/443#discussion_r1140725467


##
poi-ooxml/src/main/java/org/apache/poi/xssf/streaming/SheetDataWriter.java:
##
@@ -397,46 +397,51 @@ protected void outputEscapedString(String s) throws IOException {
             return;
         }
 
-        for (Iterator<String> iter = CodepointsUtil.iteratorFor(s); iter.hasNext(); ) {
-            String codepoint = iter.next();
+        for (int i = 0; i < s.length(); i++) {
+            final int codepoint = s.codePointAt(i);
+
             switch (codepoint) {
-                case "<":
+                case 60: // <

Review Comment:
   Good idea. Done.






[GitHub] [poi] rascmatt commented on a diff in pull request #443: [Bug-66532] Improve performance of SheetDataWriter

2023-03-17 Thread via GitHub


rascmatt commented on code in PR #443:
URL: https://github.com/apache/poi/pull/443#discussion_r1140725183


##
poi-ooxml/src/main/java/org/apache/poi/xssf/streaming/SheetDataWriter.java:
##
@@ -397,46 +397,51 @@ protected void outputEscapedString(String s) throws IOException {
             return;
         }
 
-        for (Iterator<String> iter = CodepointsUtil.iteratorFor(s); iter.hasNext(); ) {
-            String codepoint = iter.next();
+        for (int i = 0; i < s.length(); i++) {
+            final int codepoint = s.codePointAt(i);

Review Comment:
   Yes, but then the codePointAt() method will not receive the correct index. I 
switched to using the codePoints iterator, which does not seem to have a big 
performance impact.
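
The codePoints-based iteration mentioned in this comment might look roughly like the sketch below. It is not the code from the PR: the class name `CodePointsIteratorDemo` and the helper `describe` are hypothetical and only illustrate that `String.codePoints()` yields one `int` per code point, so no manual index bookkeeping is needed.

```java
import java.util.PrimitiveIterator;

final class CodePointsIteratorDemo {
    /** Renders each code point of s as U+XXXX, separated by spaces (illustrative helper). */
    static String describe(String s) {
        StringBuilder sb = new StringBuilder();
        // codePoints() combines each valid surrogate pair into a single int value.
        for (PrimitiveIterator.OfInt iter = s.codePoints().iterator(); iter.hasNext(); ) {
            int codepoint = iter.nextInt();
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(String.format("U+%04X", codepoint));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // 'a', U+1F600 (stored as a surrogate pair), 'b'
        System.out.println(describe("a\uD83D\uDE00b"));
    }
}
```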






[GitHub] [poi] pjfanning commented on a diff in pull request #443: [Bug-66532] Improve performance of SheetDataWriter

2023-03-17 Thread via GitHub


pjfanning commented on code in PR #443:
URL: https://github.com/apache/poi/pull/443#discussion_r1140652528


##
poi-ooxml/src/main/java/org/apache/poi/xssf/streaming/SheetDataWriter.java:
##
@@ -397,46 +397,51 @@ protected void outputEscapedString(String s) throws IOException {
             return;
         }
 
-        for (Iterator<String> iter = CodepointsUtil.iteratorFor(s); iter.hasNext(); ) {
-            String codepoint = iter.next();
+        for (int i = 0; i < s.length(); i++) {
+            final int codepoint = s.codePointAt(i);

Review Comment:
   loops should not increment in multiple places (ideally) - if you use the 
codePointCount for the limit of the loop then the count will be correct






[GitHub] [poi] pjfanning commented on pull request #443: [Bug-66532] Improve performance of SheetDataWriter

2023-03-17 Thread via GitHub


pjfanning commented on PR #443:
URL: https://github.com/apache/poi/pull/443#issuecomment-1474345524

   > Following you will find the results of a JMH benchmark I ran with 3 
different versions. Unfortunately I was unable to get JMH running with Gradle, 
so I ran this in an external Maven Project.
   > 
   > The latest released version (5.2.3): 'Main.benchOriginal'
   > An unreleased improvement (#405): 'Main.bench_405'
   > And the version proposed in this commit: 'Main.bench_66532'
   > 
   > Benchmark           Mode   Cnt        Score         Error  Units
   > Main.benchOriginal  thrpt   15   520433.760 ±   81525.743  ops/s
   > Main.bench_405      thrpt   15   579381.912 ±   30395.196  ops/s
   > Main.bench_66532    thrpt   15  3029062.360 ±  141908.012  ops/s
   
   could you share your benchmark code? a GitHub project or some such
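
As a reading aid for the quoted scores, the relative throughput can be worked out with a few lines of Java. This is only arithmetic on the three averages quoted above and ignores the reported error margins. The class and method names are hypothetical. On these figures the proposed version comes out at roughly 5.8 times the 5.2.3 throughput and roughly 5.2 times that of #405.

```java
final class SpeedupFromQuotedScores {
    /** Ratio of two throughput scores in ops/s; a value above 1 means the candidate is faster. */
    static double speedup(double candidateOpsPerSec, double baselineOpsPerSec) {
        return candidateOpsPerSec / baselineOpsPerSec;
    }

    public static void main(String[] args) {
        double original = 520433.760;   // Main.benchOriginal (5.2.3)
        double pr405 = 579381.912;      // Main.bench_405
        double bug66532 = 3029062.360;  // Main.bench_66532
        System.out.printf("vs 5.2.3: %.2fx%n", speedup(bug66532, original));
        System.out.printf("vs #405:  %.2fx%n", speedup(bug66532, pr405));
    }
}
```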





[GitHub] [poi] pjfanning commented on a diff in pull request #443: [Bug-66532] Improve performance of SheetDataWriter

2023-03-17 Thread via GitHub


pjfanning commented on code in PR #443:
URL: https://github.com/apache/poi/pull/443#discussion_r1140653374


##
poi-ooxml/src/main/java/org/apache/poi/xssf/streaming/SheetDataWriter.java:
##
@@ -397,46 +397,51 @@ protected void outputEscapedString(String s) throws IOException {
             return;
         }
 
-        for (Iterator<String> iter = CodepointsUtil.iteratorFor(s); iter.hasNext(); ) {
-            String codepoint = iter.next();
+        for (int i = 0; i < s.length(); i++) {
+            final int codepoint = s.codePointAt(i);
+
             switch (codepoint) {
-                case "<":
+                case 60: // <

Review Comment:
   could you test with `'<'`? - this should have similar performance but would 
be more readable
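
A minimal sketch of the suggestion, assuming the switch operates on an `int` code point: a `char` literal is a constant expression that is assignable to `int`, so `case '<':` compiles and matches the same value as `case 60:`. The class name and the set of escaped characters here are illustrative only; the diff above shows just the `<` case.

```java
final class CharLiteralSwitch {
    /** Returns an XML-style escape for a few characters, or the code point itself as a String. */
    static String escape(int codepoint) {
        switch (codepoint) {
            case '<': // same value as 60
                return "&lt;";
            case '>': // same value as 62
                return "&gt;";
            case '&': // same value as 38
                return "&amp;";
            default:
                return new String(Character.toChars(codepoint));
        }
    }

    public static void main(String[] args) {
        System.out.println(escape(60));
        System.out.println(escape('a'));
    }
}
```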






[GitHub] [poi] pjfanning commented on a diff in pull request #443: [Bug-66532] Improve performance of SheetDataWriter

2023-03-17 Thread via GitHub


pjfanning commented on code in PR #443:
URL: https://github.com/apache/poi/pull/443#discussion_r1140652528


##
poi-ooxml/src/main/java/org/apache/poi/xssf/streaming/SheetDataWriter.java:
##
@@ -397,46 +397,51 @@ protected void outputEscapedString(String s) throws IOException {
             return;
         }
 
-        for (Iterator<String> iter = CodepointsUtil.iteratorFor(s); iter.hasNext(); ) {
-            String codepoint = iter.next();
+        for (int i = 0; i < s.length(); i++) {
+            final int codepoint = s.codePointAt(i);

Review Comment:
   loops should not increment in multiple places (ideally) - if you use the 
codePointCount for the limit of the loop then the count will be correct






[GitHub] [poi] rascmatt commented on a diff in pull request #443: [Bug-66532] Improve performance of SheetDataWriter

2023-03-17 Thread via GitHub


rascmatt commented on code in PR #443:
URL: https://github.com/apache/poi/pull/443#discussion_r1140650040


##
poi-ooxml/src/main/java/org/apache/poi/xssf/streaming/SheetDataWriter.java:
##
@@ -397,46 +397,51 @@ protected void outputEscapedString(String s) throws IOException {
             return;
         }
 
-        for (Iterator<String> iter = CodepointsUtil.iteratorFor(s); iter.hasNext(); ) {
-            String codepoint = iter.next();
+        for (int i = 0; i < s.length(); i++) {
+            final int codepoint = s.codePointAt(i);

Review Comment:
   I might need to improve the readability of the loop, however I think this is 
covered by line 441, where (in case of a character pair) the counter is 
additionally increased.
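
The pattern described here, an index loop bounded by `s.length()` that steps one extra position whenever the code point occupies two chars, can be sketched as follows. This is an assumption about the shape of the loop, not a copy of line 441 of the PR. The class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

final class IndexedCodePointLoop {
    /** Collects the code points of s using a char index plus an extra step for surrogate pairs. */
    static List<Integer> codePoints(String s) {
        List<Integer> out = new ArrayList<>();
        for (int i = 0; i < s.length(); i++) {
            final int codepoint = s.codePointAt(i);
            out.add(codepoint);
            // A supplementary code point takes two chars, so skip its low surrogate.
            if (Character.charCount(codepoint) == 2) {
                i++;
            }
        }
        return out;
    }

    public static void main(String[] args) {
        // 'a', U+1F600 (stored as a surrogate pair), 'b'
        System.out.println(codePoints("a\uD83D\uDE00b"));
    }
}
```

The second increment inside the loop body is what the review comments object to on readability grounds, even though the result is correct.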






[GitHub] [poi] pjfanning commented on a diff in pull request #443: [Bug-66532] Improve performance of SheetDataWriter

2023-03-17 Thread via GitHub


pjfanning commented on code in PR #443:
URL: https://github.com/apache/poi/pull/443#discussion_r1140647602


##
poi-ooxml/src/main/java/org/apache/poi/xssf/streaming/SheetDataWriter.java:
##
@@ -397,46 +397,51 @@ protected void outputEscapedString(String s) throws IOException {
             return;
         }
 
-        for (Iterator<String> iter = CodepointsUtil.iteratorFor(s); iter.hasNext(); ) {
-            String codepoint = iter.next();
+        for (int i = 0; i < s.length(); i++) {
+            final int codepoint = s.codePointAt(i);

Review Comment:
   this code is wrong - s.length() is the number of chars in the string, not 
the number of codepoints.
   
   see https://www.w3resource.com/java-tutorial/string/string_codepointcount.php
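
The difference between the two counts can be seen with a string that contains one character outside the Basic Multilingual Plane. This is a small illustrative sketch. The class and method names are hypothetical, while `String.length()` and `String.codePointCount(int, int)` are standard JDK methods.

```java
final class LengthVsCodePoints {
    /** Number of UTF-16 code units, which is what String.length() reports. */
    static int charCount(String s) {
        return s.length();
    }

    /** Number of Unicode code points in the whole string. */
    static int codePointCount(String s) {
        return s.codePointCount(0, s.length());
    }

    public static void main(String[] args) {
        // U+1F600 lies outside the BMP, so it is stored as a surrogate pair (two chars).
        String s = "a\uD83D\uDE00b";
        System.out.println(charCount(s));      // 4 chars
        System.out.println(codePointCount(s)); // 3 code points
    }
}
```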






[Bug 66532] [PATCH] Improve performance of SheetDataWriter

2023-03-17 Thread bugzilla
https://bz.apache.org/bugzilla/show_bug.cgi?id=66532

--- Comment #4 from Matthias Raschhofer  ---
Thank you for your comments. I'm happy to use git, however I thought it was
readonly and I should provide patches here. I opened a pull request in git
(https://github.com/apache/poi/pull/443).

I attached a benchmark also comparing the performance against the change made
with https://github.com/apache/poi/pull/405. I think the difference is
significant.

I don't think there is an issue with the number of chars vs. the number of
codepoints, since the loop counter is increased in case the codepoint is in
fact a pair of characters. There are unit tests in the TestSheetDataWriter
asserting the correct behaviour for unicode surrogates as well as the
'replaceWithQuestionMark' behaviour.

-- 
You are receiving this mail because:
You are the assignee for the bug.



[Bug 66532] [PATCH] Improve performance of SheetDataWriter

2023-03-17 Thread bugzilla
https://bz.apache.org/bugzilla/show_bug.cgi?id=66532

--- Comment #3 from Matthias Raschhofer  ---
Created attachment 38525
  --> https://bz.apache.org/bugzilla/attachment.cgi?id=38525&action=edit
Benchmark

The attached file contains a benchmark comparing the performance of the
proposed patch against previous iterations of the same method.




[GitHub] [poi] rascmatt commented on pull request #443: [Bug-66532] Improve performance of SheetDataWriter

2023-03-17 Thread via GitHub


rascmatt commented on PR #443:
URL: https://github.com/apache/poi/pull/443#issuecomment-1474314063

   The following is a JMH benchmark I ran with 3 different versions. 
Unfortunately I was unable to get JMH running with Gradle, so I ran this in an 
external Maven Project.

   The latest released version (5.2.3): 'Main.benchOriginal'
   An unreleased improvement (#405): 'Main.bench_405'
   And the version proposed in this commit: 'Main.bench_66532'
   
   # JMH version: 1.36
   # VM version: JDK 11.0.18, OpenJDK 64-Bit Server VM, 
11.0.18+10-post-Ubuntu-0ubuntu120.04.1
   # VM invoker: /usr/lib/jvm/java-11-openjdk-amd64/bin/java
   # VM options: -Dfile.encoding=UTF-8
   # Blackhole mode: full + dont-inline hint (auto-detected, use 
-Djmh.blackhole.autoDetect=false to disable)
   # Warmup: 5 iterations, 10 s each
   # Measurement: 5 iterations, 10 s each
   # Timeout: 10 min per iteration
   # Threads: 1 thread, will synchronize iterations
   # Benchmark mode: Throughput, ops/time
   # Benchmark: org.apache.poi.xssf.streaming.Main.benchOriginal
   
   # Run progress: 0.00% complete, ETA 00:15:00
   # Fork: 1 of 3
   # Warmup Iteration   1: 522589.079 ops/s
   # Warmup Iteration   2: 513106.521 ops/s
   # Warmup Iteration   3: 612154.159 ops/s
   # Warmup Iteration   4: 619761.849 ops/s
   # Warmup Iteration   5: 621498.363 ops/s
   Iteration   1: 600088.567 ops/s
   Iteration   2: 611062.556 ops/s
   Iteration   3: 618327.900 ops/s
   Iteration   4: 619993.561 ops/s
   Iteration   5: 621859.279 ops/s
   
   # Run progress: 11.11% complete, ETA 00:17:06
   # Fork: 2 of 3
   # Warmup Iteration   1: 478246.706 ops/s
   # Warmup Iteration   2: 449997.179 ops/s
   # Warmup Iteration   3: 514069.045 ops/s
   # Warmup Iteration   4: 512299.229 ops/s
   # Warmup Iteration   5: 514957.173 ops/s
   Iteration   1: 510457.146 ops/s
   Iteration   2: 514257.723 ops/s
   Iteration   3: 509737.514 ops/s
   Iteration   4: 513989.051 ops/s
   Iteration   5: 512948.565 ops/s
   
   # Run progress: 22.22% complete, ETA 00:14:48
   # Fork: 3 of 3
   # Warmup Iteration   1: 411691.418 ops/s
   # Warmup Iteration   2: 403264.994 ops/s
   # Warmup Iteration   3: 436642.233 ops/s
   # Warmup Iteration   4: 435978.425 ops/s
   # Warmup Iteration   5: 437983.975 ops/s
   Iteration   1: 430823.470 ops/s
   Iteration   2: 434853.347 ops/s
   Iteration   3: 435892.963 ops/s
   Iteration   4: 434907.188 ops/s
   Iteration   5: 437307.567 ops/s
   
   
   Result "org.apache.poi.xssf.streaming.Main.benchOriginal":
 520433.760 ±(99.9%) 81525.743 ops/s [Average]
 (min, avg, max) = (430823.470, 520433.760, 621859.279), stdev = 76259.231
 CI (99.9%): [438908.017, 601959.503] (assumes normal distribution)
   
   
   # JMH version: 1.36
   # VM version: JDK 11.0.18, OpenJDK 64-Bit Server VM, 
11.0.18+10-post-Ubuntu-0ubuntu120.04.1
   # VM invoker: /usr/lib/jvm/java-11-openjdk-amd64/bin/java
   # VM options: -Dfile.encoding=UTF-8
   # Blackhole mode: full + dont-inline hint (auto-detected, use 
-Djmh.blackhole.autoDetect=false to disable)
   # Warmup: 5 iterations, 10 s each
   # Measurement: 5 iterations, 10 s each
   # Timeout: 10 min per iteration
   # Threads: 1 thread, will synchronize iterations
   # Benchmark mode: Throughput, ops/time
   # Benchmark: org.apache.poi.xssf.streaming.Main.bench_405
   
   # Run progress: 33.33% complete, ETA 00:12:32
   # Fork: 1 of 3
   # Warmup Iteration   1: 543506.534 ops/s
   # Warmup Iteration   2: 511470.709 ops/s
   # Warmup Iteration   3: 567795.940 ops/s
   # Warmup Iteration   4: 566566.365 ops/s
   # Warmup Iteration   5: 564995.160 ops/s
   Iteration   1: 566863.936 ops/s
   Iteration   2: 566574.215 ops/s
   Iteration   3: 566127.761 ops/s
   Iteration   4: 564921.292 ops/s
   Iteration   5: 564855.313 ops/s
   
   # Run progress: 44.44% complete, ETA 00:10:30
   # Fork: 2 of 3
   # Warmup Iteration   1: 557404.856 ops/s
   # Warmup Iteration   2: 518870.803 ops/s
   # Warmup Iteration   3: 580301.043 ops/s
   # Warmup Iteration   4: 580045.771 ops/s
   # Warmup Iteration   5: 579856.459 ops/s
   Iteration   1: 581015.650 ops/s
   Iteration   2: 578231.905 ops/s
   Iteration   3: 580157.145 ops/s
   Iteration   4: 513392.675 ops/s
   Iteration   5: 550622.830 ops/s
   
   # Run progress: 55.56% complete, ETA 00:08:26
   # Fork: 3 of 3
   # Warmup Iteration   1: 566280.331 ops/s
   # Warmup Iteration   2: 515131.671 ops/s
   # Warmup Iteration   3: 614036.961 ops/s
   # Warmup Iteration   4: 611286.973 ops/s
   # Warmup Iteration   5: 611678.647 ops/s
   Iteration   1: 613856.900 ops/s
   Iteration   2: 611081.324 ops/s
   Iteration   3: 613608.293 ops/s
   Iteration   4: 607487.350 ops/s
   Iteration   5: 611932.087 ops/s
   
   
   Result "org.apache.poi.xssf.streaming.Main.bench_405":
 579381.912 ±(99.9%) 30395.196 ops/s [Average]
 (min, avg, max) = (513392.675, 579381.912, 613856.900), stdev = 28431.685
 CI (99.9%): [548986.716, 609777.108] (assumes normal 

[GitHub] [poi] rascmatt opened a new pull request, #443: [Bug-66532] Improve performance of SheetDataWriter

2023-03-17 Thread via GitHub


rascmatt opened a new pull request, #443:
URL: https://github.com/apache/poi/pull/443

   Simplify loop and avoid code point to string conversions.





[Bug 66532] [PATCH] Improve performance of SheetDataWriter

2023-03-17 Thread bugzilla
https://bz.apache.org/bugzilla/show_bug.cgi?id=66532

--- Comment #2 from PJ Fanning  ---
https://github.com/apache/poi/pull/405 is an unreleased perf change that may
improve perf time of existing code.




[Bug 66532] [PATCH] Improve performance of SheetDataWriter

2023-03-17 Thread bugzilla
https://bz.apache.org/bugzilla/show_bug.cgi?id=66532

--- Comment #1 from PJ Fanning  ---
It's hard to read this patch on my phone but a quick look makes me think there
is a bug with use of string length - the number of chars as opposed to the number
of codepoints. I'm very reluctant to take this change. Can you provide a JMH
benchmark to back up your claims? If the codepoint code is super slow, we could
consider an option where users configure whether they want char iteration or
codepoint iteration.

Is there any chance of using GitHub instead of patch files?




[Bug 66532] New: [PATCH] Improve performance of SheetDataWriter

2023-03-17 Thread bugzilla
https://bz.apache.org/bugzilla/show_bug.cgi?id=66532

Bug ID: 66532
   Summary: [PATCH] Improve performance of SheetDataWriter
   Product: POI
   Version: unspecified
  Hardware: All
OS: All
Status: NEW
  Severity: enhancement
  Priority: P2
 Component: XSSF
  Assignee: dev@poi.apache.org
  Reporter: matthias.raschho...@gmail.com
  Target Milestone: ---

Created attachment 38524
  --> https://bz.apache.org/bugzilla/attachment.cgi?id=38524&action=edit
Patchset of SheetDataWriter

Performance tests showed that creating rows in an SXSSF sheet with lots of data
spends a substantial amount of CPU time escaping the strings to write using
#outputEscapedString.

By simplifying the loop and avoiding repeated conversions between strings and
codepoints, we can improve the writing speed by a good amount.
