Re: [JPP-Devel] Slow parsing of JML files
Hi,

Quite strange: if I set a string attribute to "2017-07-31" on a single object and then convert the attribute to a date, everything works fine.

When I import a big json with 1 million features containing a string attribute with value "2017-07-31" and then convert this attribute to a date, everything seems fine, but I can't display the table and it breaks the application (complete message below).

It seems that in this case the converter failed to convert the strings to dates (if I check the attributes after conversion, they still contain strings, even though View/Edit Schema tells me the attribute is of date type...).

Michaël

>> json : 8s (date is interpreted as string, I could not convert the string formatted as yyyy-MM-dd to a date through View/Edit schema plugins*)
> weird, will have a look at it. but again, a complete stack would have helped ;)

Exception in thread "AWT-EventQueue-0" java.lang.IllegalArgumentException: Cannot format given Object as a Date
    at java.text.DateFormat.format(DateFormat.java:310)
    at java.text.Format.format(Format.java:157)
    at com.vividsolutions.jump.workbench.ui.AttributeTablePanel$MyTable$2.setValue(AttributeTablePanel.java:192)
    at javax.swing.table.DefaultTableCellRenderer.getTableCellRendererComponent(DefaultTableCellRenderer.java:257)
    at javax.swing.JTable.prepareRenderer(JTable.java:5723)
    at javax.swing.plaf.basic.BasicTableUI.paintCell(BasicTableUI.java:2114)
    at javax.swing.plaf.basic.BasicTableUI.paintCells(BasicTableUI.java:2016)
    at javax.swing.plaf.basic.BasicTableUI.paint(BasicTableUI.java:1812)
    at javax.swing.plaf.ComponentUI.update(ComponentUI.java:161)
    at javax.swing.JComponent.paintComponent(JComponent.java:780)
    at javax.swing.JComponent.paint(JComponent.java:1056)
    at javax.swing.JComponent.paintChildren(JComponent.java:889)
    at javax.swing.JComponent.paint(JComponent.java:1065)
    at javax.swing.JViewport.paint(JViewport.java:728)
    at javax.swing.JComponent.paintChildren(JComponent.java:889)
    at javax.swing.JComponent.paint(JComponent.java:1065)
    at javax.swing.JComponent.paintChildren(JComponent.java:889)
    at javax.swing.JComponent.paint(JComponent.java:1065)
    at javax.swing.JComponent.paintChildren(JComponent.java:889)
    at javax.swing.JComponent.paint(JComponent.java:1065)
    at javax.swing.JComponent.paintChildren(JComponent.java:889)
    at javax.swing.JComponent.paint(JComponent.java:1065)
    at javax.swing.JComponent.paintChildren(JComponent.java:889)
    at javax.swing.JComponent.paint(JComponent.java:1065)
    at javax.swing.JComponent.paintChildren(JComponent.java:889)
    at javax.swing.JComponent.paint(JComponent.java:1065)
    at javax.swing.JLayeredPane.paint(JLayeredPane.java:586)
    at javax.swing.JComponent.paintChildren(JComponent.java:889)
    at javax.swing.JComponent.paint(JComponent.java:1065)
    at javax.swing.JComponent.paintChildren(JComponent.java:889)
    at javax.swing.JComponent.paint(JComponent.java:1065)
    at javax.swing.JComponent.paintChildren(JComponent.java:889)
    at javax.swing.JComponent.paint(JComponent.java:1065)
    at javax.swing.JLayeredPane.paint(JLayeredPane.java:586)
    at javax.swing.JComponent.paintToOffscreen(JComponent.java:5210)
    at javax.swing.RepaintManager$PaintManager.paintDoubleBuffered(RepaintManager.java:1579)
    at javax.swing.RepaintManager$PaintManager.paint(RepaintManager.java:1502)
    at javax.swing.RepaintManager.paint(RepaintManager.java:1272)
    at javax.swing.JComponent._paintImmediately(JComponent.java:5158)
    at javax.swing.JComponent.paintImmediately(JComponent.java:4969)
    at javax.swing.RepaintManager$4.run(RepaintManager.java:831)
    at javax.swing.RepaintManager$4.run(RepaintManager.java:814)
    at java.security.AccessController.doPrivileged(Native Method)
    at java.security.ProtectionDomain$JavaSecurityAccessImpl.doIntersectionPrivilege(ProtectionDomain.java:80)
    at javax.swing.RepaintManager.paintDirtyRegions(RepaintManager.java:814)
    at javax.swing.RepaintManager.paintDirtyRegions(RepaintManager.java:789)
    at javax.swing.RepaintManager.prePaintDirtyRegions(RepaintManager.java:738)
    at javax.swing.RepaintManager.access$1200(RepaintManager.java:64)
    at javax.swing.RepaintManager$ProcessingRunnable.run(RepaintManager.java:1732)
    at java.awt.event.InvocationEvent.dispatch(InvocationEvent.java:311)
    at java.awt.EventQueue.dispatchEventImpl(EventQueue.java:756)
    at java.awt.EventQueue.access$500(EventQueue.java:97)
    at java.awt.EventQueue$3.run(EventQueue.java:709)
    at java.awt.EventQueue$3.run(EventQueue.java:703)
    at java.security.AccessController.doPrivileged(Native Method)
    at java.security.ProtectionDomain$JavaSecurityAccessImpl.doIntersectionPrivilege(ProtectionDomain.java:80)
    at java.awt.EventQueue.dispatchEvent(EventQueue.java:726)
    at java.awt.EventDispatchThread.pumpOneEventForFilters(EventDispatchThread.java:201)
    at java.awt.Eve
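For readers following the stack trace: its top frames are consistent with a java.text.DateFormat being handed a value that is not a Date, which fits the observation that the attributes still contain strings. The sketch below reproduces the exception with a plain String and shows one defensive way a table cell could fall back to the raw value. DateCellText, render and formatThrows are hypothetical names used only for illustration; this is not the AttributeTablePanel code, and it says nothing about how the problem was actually fixed.

```java
import java.text.DateFormat;
import java.util.Date;

// Hypothetical helper for illustration only; not OpenJUMP code.
final class DateCellText {

    // Returns the text to display for a cell of a date-typed column.
    static String render(DateFormat format, Object value) {
        if (value == null) {
            return "";
        }
        if (value instanceof Date) {
            return format.format((Date) value);
        }
        // Anything else (for example a String left behind by a failed
        // conversion) is shown as-is instead of being passed to DateFormat.
        return value.toString();
    }

    // Returns true if formatting the value throws IllegalArgumentException,
    // the exception type seen at the top of the stack trace.
    static boolean formatThrows(DateFormat format, Object value) {
        try {
            // With a static type of Object this resolves to Format.format(Object),
            // which rejects values that are neither Date nor Number.
            format.format(value);
            return false;
        } catch (IllegalArgumentException e) {
            return true;
        }
    }
}
```

With a SimpleDateFormat("yyyy-MM-dd"), formatThrows reports true for the String "2017-07-31" and false for a real Date, while render displays both without throwing.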
Re: [JPP-Devel] Slow parsing of JML files
Hi,

With my data and computer, loading the JML file now takes 1 min 40 sec. Much better than the 25 minutes it used to take.

-Jukka-

-----Original Message-----
From: edgar.sol...@web.de [mailto:edgar.sol...@web.de]
Sent: 31 July 2017 12:11
To: jump devel
Subject: Re: [JPP-Devel] Slow parsing of JML files

On 31.07.2017 08:19, Michaël Michaud wrote:
> Hi Ede,
>
> I can confirm what you say.
>
> On a file with 1M simple shapes + 6 string attributes + 1 date attribute
>
> shp : 23s (same time with or without attributes, 19s if the file set has no shx)

wonder how shp reader deals with dates, it can't parse them via FlexibleDateParser, if there is no delay.

> jml : *18s* (17s without the date and *5mn with the date and before your optimization !!!*)

Jukka's dataset had 3 date entries over 1.1Mil features, imagine the slow down

> json : 8s (date is interpreted as string, I could not convert the string formatted as yyyy-MM-dd to a date through View/Edit schema plugins*)

weird, will have a look at it. but again, a complete stack would have helped ;)

..thx ede

> Michaël
>
> Exception in thread "AWT-EventQueue-0" java.lang.IllegalArgumentException: Cannot format given Object as a Date
>     at java.text.DateFormat.format(DateFormat.java:310)
>     at java.text.Format.format(Format.java:157)
>     at com.vividsolutions.jump.workbench.ui.AttributeTablePanel$MyTable$2.setValue(AttributeTablePanel.java:192)

--
Check out the vibrant tech community on one of the world's most engaging tech sites, Slashdot.org! http://sdm.link/slashdot
___
Jump-pilot-devel mailing list
Jump-pilot-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/jump-pilot-devel
Re: [JPP-Devel] Slow parsing of JML files
On 31.07.2017 08:19, Michaël Michaud wrote:
> Hi Ede,
>
> I can confirm what you say.
>
> On a file with 1M simple shapes + 6 string attributes + 1 date attribute
>
> shp : 23s (same time with or without attributes, 19s if the file set has no shx)

wonder how shp reader deals with dates, it can't parse them via FlexibleDateParser, if there is no delay.

> jml : *18s* (17s without the date and *5mn with the date and before your optimization !!!*)

Jukka's dataset had 3 date entries over 1.1Mil features, imagine the slow down

> json : 8s (date is interpreted as string, I could not convert the string formatted as yyyy-MM-dd to a date through View/Edit schema plugins*)

weird, will have a look at it. but again, a complete stack would have helped ;)

..thx ede

> Michaël
>
> Exception in thread "AWT-EventQueue-0" java.lang.IllegalArgumentException: Cannot format given Object as a Date
>     at java.text.DateFormat.format(DateFormat.java:310)
>     at java.text.Format.format(Format.java:157)
>     at com.vividsolutions.jump.workbench.ui.AttributeTablePanel$MyTable$2.setValue(AttributeTablePanel.java:192)
Re: [JPP-Devel] Slow parsing of JML files
Hi Ede,

I can confirm what you say.

On a file with 1M simple shapes + 6 string attributes + 1 date attribute:

shp : 23s (same time with or without attributes, 19s if the file set has no shx)
jml : *18s* (17s without the date and *5 min with the date and before your optimization !!!*)
json : 8s (date is interpreted as string, I could not convert the string formatted as yyyy-MM-dd to a date through View/Edit schema plugins*)

Michaël

Exception in thread "AWT-EventQueue-0" java.lang.IllegalArgumentException: Cannot format given Object as a Date
    at java.text.DateFormat.format(DateFormat.java:310)
    at java.text.Format.format(Format.java:157)
    at com.vividsolutions.jump.workbench.ui.AttributeTablePanel$MyTable$2.setValue(AttributeTablePanel.java:192)

On 30/07/2017 at 18:46, edgar.sol...@web.de wrote:
> Jukka,
>
> sorry, r5482 is the latest. commit msg stays the same.
>
> ..ede
Re: [JPP-Devel] Slow parsing of JML files
Jukka,

sorry, r5482 is the latest. commit msg stays the same.

..ede

On 7/30/2017 18:44, edgar.sol...@web.de wrote:
> Jukka,
>
> can you try r4682?
>
>> Log Message:
>> ---
>> speeding up JML/GML reader when reading time attributes by parsing them lazy (during access later)
>> - adding FlexibleFeature, FlexibleFeatureSchema
>> - porting GMLReader to use the flexible classes
>
> ..ede
Re: [JPP-Devel] Slow parsing of JML files
Jukka,

can you try r4682?

> Log Message:
> ---
> speeding up JML/GML reader when reading time attributes by parsing them lazy (during access later)
> - adding FlexibleFeature, FlexibleFeatureSchema
> - porting GMLReader to use the flexible classes

..ede

On 7/29/2017 22:50, Rahkonen Jukka (MML) wrote:
> Hi,
>
> I took some timings with todays nightly build by opening my dataset with 1101817 polygons and quite many attributes.
>
> Opening time
> JML: 25 minutes 14 sec
> Shapefile: 1 minute 28 sec
> GeoJSON: 45 seconds
>
> Seems that there must be something suboptimal in the JML driver and that Ede did really nice work with the GeoJSON driver.
>
> -Jukka-
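The commit message quoted in this mail describes parsing time attributes lazily, that is, on later access rather than while the file is read. A minimal sketch of that general idea follows, assuming a fixed yyyy-MM-dd format. LazyDateAttribute is a hypothetical class written for this illustration; it is not the FlexibleFeature or FlexibleFeatureSchema code from the commit, whose details are not shown in this thread.

```java
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.Date;

// Hypothetical holder illustrating lazy parsing; not the OpenJUMP implementation.
final class LazyDateAttribute {
    private final String raw; // text exactly as read from the file
    private Date parsed;      // stays null until the value is first requested

    LazyDateAttribute(String raw) {
        this.raw = raw;
    }

    // Parses the raw text on the first call and returns the cached Date afterwards.
    Date get() {
        if (parsed == null) {
            // Assumed format, chosen only for this example.
            SimpleDateFormat format = new SimpleDateFormat("yyyy-MM-dd");
            format.setLenient(false);
            try {
                parsed = format.parse(raw);
            } catch (ParseException e) {
                throw new IllegalArgumentException("Not a yyyy-MM-dd date: " + raw, e);
            }
        }
        return parsed;
    }

    // True once the raw text has been parsed.
    boolean isParsed() {
        return parsed != null;
    }
}
```

Reading a million features then only stores a million strings; the parsing cost is paid per value and only for values that are actually looked at, for example the rows currently visible in the attribute table.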
Re: [JPP-Devel] Slow parsing of JML files
Jukka,

thanks for the flowers :).. try the same w/o date attributes, that should speed up JML by orders of magnitude.

..ede

On 7/29/2017 22:50, Rahkonen Jukka (MML) wrote:
> Hi,
>
> I took some timings with todays nightly build by opening my dataset with 1101817 polygons and quite many attributes.
>
> Opening time
> JML: 25 minutes 14 sec
> Shapefile: 1 minute 28 sec
> GeoJSON: 45 seconds
>
> Seems that there must be something suboptimal in the JML driver and that Ede did really nice work with the GeoJSON driver.
>
> -Jukka-
Re: [JPP-Devel] Slow parsing of JML files
Hi,

I took some timings with today's nightly build by opening my dataset with 1101817 polygons and quite a lot of attributes.

Opening time
JML: 25 minutes 14 sec
Shapefile: 1 minute 28 sec
GeoJSON: 45 seconds

It seems there must be something suboptimal in the JML driver, and that Ede did really nice work with the GeoJSON driver.

-Jukka-

From: edgar.sol...@web.de
Sent: 29 July 2017 22:25
To: OpenJump develop and use
Subject: Re: [JPP-Devel] Slow parsing of JML files

just spent some time with GMLReader, which is actually our JMLReader and according to my finding it get's some magnitudes faster when the date parsing is commented out (in GMLInputTemplate.getColumnValue()).

had a short look at the FlexibleDateParser and have the impression that regex patterns are not precompiled but recompiled on every usage. btw. that would be a classic in terms of parsing slowdowns.

..ede
Re: [JPP-Devel] Slow parsing of JML files
just spent some time with GMLReader, which is actually our JMLReader, and according to my findings it gets some orders of magnitude faster when the date parsing is commented out (in GMLInputTemplate.getColumnValue()).

had a short look at the FlexibleDateParser and have the impression that regex patterns are not precompiled but recompiled on every usage. btw. that would be a classic in terms of parsing slowdowns.

..ede

On 7/29/2017 13:03, Michaël Michaud wrote:
> Hi,
>
> I also remembered that jml reader was quite slow compared with shp, but my last tests show me only slight differences.
>
> (Maybe shapefile reader has slowed down with my last commits - attempts to make it more robust with shp coming from esri or qgis)
>
> Now, reading a big file (888000 features) with complex polygons and 3 attributes is just a bit longer with jml (38 vs 32s)
> For a file containing 1M of simple features (squares), it is even faster with jml (14 vs 23). Reading more attributes seems longer with jml.
>
> I would say :
>
> reading complex geometries is much longer with jml
> reading attributes is longer with jml
> reading simple geometries is longer with shx/shp
>
> Jukka, have you examples where jml is much longer that shp ?
>
> Michaël
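The remark about regex patterns being recompiled on every usage can be illustrated with a small sketch. It contrasts String.matches(), which compiles its pattern on each call, with a Pattern compiled once and reused. IsoDateMatcher and its date pattern are assumptions made for this example; the code is not taken from FlexibleDateParser, and the mail above only states an impression of that class.

```java
import java.util.regex.Pattern;

// Hypothetical illustration of precompiled vs. recompiled regex patterns.
final class IsoDateMatcher {
    // Compiled once when the class is loaded and reused for every value.
    private static final Pattern ISO_DATE = Pattern.compile("\\d{4}-\\d{2}-\\d{2}");

    // Slow variant: String.matches() compiles the regex again on each call.
    static boolean matchesRecompiling(String value) {
        return value.matches("\\d{4}-\\d{2}-\\d{2}");
    }

    // Fast variant: only the matching itself is repeated for each value.
    static boolean matchesPrecompiled(String value) {
        return ISO_DATE.matcher(value).matches();
    }
}
```

Both methods return the same result for any input. The difference is cost: with several candidate patterns tried per value and millions of values, repeated compilation can come to dominate the parsing time.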
Re: [JPP-Devel] Slow parsing of JML files
Hi,

I also remembered the jml reader as being quite slow compared with shp, but my latest tests show only slight differences.

(Maybe the shapefile reader has slowed down with my last commits, which were attempts to make it more robust with shp files coming from esri or qgis.)

Now, reading a big file (888000 features) with complex polygons and 3 attributes takes just a bit longer with jml (38 vs 32s).
For a file containing 1M simple features (squares), jml is even faster (14 vs 23s). Reading more attributes seems to take longer with jml.

I would say:

reading complex geometries is much longer with jml
reading attributes is longer with jml
reading simple geometries is longer with shx/shp

Jukka, do you have examples where jml is much slower than shp?

Michaël

On 28/07/2017 at 17:25, edgar.sol...@web.de wrote:
> On 28.07.2017 17:06, Rahkonen Jukka (MML) wrote:
>> Hi,
>>
>> I noticed with a dataset having 1.2 million features with lots of attributes that OpenJUMP is rather slow in parsing its own native JML format. It takes about 30 minutes to open the file. Shapefiles are much faster but they do not suit me because long strings are truncated. Can the hardcore developers guess what is the bottle neck with JML/XML parsing and if there could be some place for improvements?
>
> hi Jukka,
>
> could you provide said dataset privately?
>
> does it make a difference when you cut down the number of attributes?
>
> ..ede
Re: [JPP-Devel] Slow parsing of JML files
On 28.07.2017 17:06, Rahkonen Jukka (MML) wrote:
> Hi,
>
> I noticed with a dataset having 1.2 million features with lots of attributes that OpenJUMP is rather slow in parsing its own native JML format. It takes about 30 minutes to open the file. Shapefiles are much faster but they do not suit me because long strings are truncated. Can the hardcore developers guess what is the bottle neck with JML/XML parsing and if there could be some place for improvements?

hi Jukka,

could you provide said dataset privately?

does it make a difference when you cut down the number of attributes?

..ede