Re: [JPP-Devel] Slow parsing of JML files

2017-07-31 Thread Michaël Michaud

Hi

Quite strange: if I set a string attribute to "2017-07-31" on a single
object and then convert the attribute to a date, everything works fine.

When I import a big JSON file with 1 million features containing a string
attribute with the value "2017-07-31" and then convert this attribute to a
date, everything seems fine, but I can't display the table and it breaks
the application (complete message below).

It seems that in this case the converter failed to convert the strings to
dates (if I check the attributes after conversion, they still contain
strings, even though View/Edit Schema tells me the attribute is of date
type...).
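
The symptom described here fits how java.text.DateFormat behaves: its format(Object) entry point accepts only a Date or a Number and throws IllegalArgumentException ("Cannot format given Object as a Date") for anything else, such as a String left over from an incomplete conversion. A minimal sketch of a defensive renderer follows; SafeDateCell and render are hypothetical names for illustration and are not taken from AttributeTablePanel.

```java
import java.text.DateFormat;
import java.text.SimpleDateFormat;
import java.util.Date;

// Hypothetical helper for illustration; not the AttributeTablePanel code.
final class SafeDateCell {

    // A DateFormat can format only Date or Number instances through
    // Format.format(Object). Any other value, for example a String that was
    // never converted, raises IllegalArgumentException.
    static String render(DateFormat fmt, Object value) {
        if (value == null) {
            return "";
        }
        if (value instanceof Date || value instanceof Number) {
            return fmt.format(value);
        }
        // Fall back to the plain text instead of failing while painting.
        return value.toString();
    }

    public static void main(String[] args) {
        DateFormat fmt = new SimpleDateFormat("yyyy-MM-dd");
        System.out.println(render(fmt, "2017-07-31")); // prints the string unchanged
        try {
            fmt.format((Object) "2017-07-31");
        } catch (IllegalArgumentException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

A guard like this only hides the symptom in the table; the values would still need to be converted to real Date objects to match the schema.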

Michaël
  

>> json : 8s (date is interpreted as string, I could not convert the string
>> formatted as yyyy-MM-dd to a date through View/Edit schema plugins*)
>
> weird, will have a look at it. but again, a complete stack would have helped ;)

Exception in thread "AWT-EventQueue-0" java.lang.IllegalArgumentException: Cannot format given Object as a Date
at java.text.DateFormat.format(DateFormat.java:310)
at java.text.Format.format(Format.java:157)
at com.vividsolutions.jump.workbench.ui.AttributeTablePanel$MyTable$2.setValue(AttributeTablePanel.java:192)
at javax.swing.table.DefaultTableCellRenderer.getTableCellRendererComponent(DefaultTableCellRenderer.java:257)
at javax.swing.JTable.prepareRenderer(JTable.java:5723)
at javax.swing.plaf.basic.BasicTableUI.paintCell(BasicTableUI.java:2114)
at javax.swing.plaf.basic.BasicTableUI.paintCells(BasicTableUI.java:2016)
at javax.swing.plaf.basic.BasicTableUI.paint(BasicTableUI.java:1812)
at javax.swing.plaf.ComponentUI.update(ComponentUI.java:161)
at javax.swing.JComponent.paintComponent(JComponent.java:780)
at javax.swing.JComponent.paint(JComponent.java:1056)
at javax.swing.JComponent.paintChildren(JComponent.java:889)
at javax.swing.JComponent.paint(JComponent.java:1065)
at javax.swing.JViewport.paint(JViewport.java:728)
at javax.swing.JComponent.paintChildren(JComponent.java:889)
at javax.swing.JComponent.paint(JComponent.java:1065)
at javax.swing.JComponent.paintChildren(JComponent.java:889)
at javax.swing.JComponent.paint(JComponent.java:1065)
at javax.swing.JComponent.paintChildren(JComponent.java:889)
at javax.swing.JComponent.paint(JComponent.java:1065)
at javax.swing.JComponent.paintChildren(JComponent.java:889)
at javax.swing.JComponent.paint(JComponent.java:1065)
at javax.swing.JComponent.paintChildren(JComponent.java:889)
at javax.swing.JComponent.paint(JComponent.java:1065)
at javax.swing.JComponent.paintChildren(JComponent.java:889)
at javax.swing.JComponent.paint(JComponent.java:1065)
at javax.swing.JLayeredPane.paint(JLayeredPane.java:586)
at javax.swing.JComponent.paintChildren(JComponent.java:889)
at javax.swing.JComponent.paint(JComponent.java:1065)
at javax.swing.JComponent.paintChildren(JComponent.java:889)
at javax.swing.JComponent.paint(JComponent.java:1065)
at javax.swing.JComponent.paintChildren(JComponent.java:889)
at javax.swing.JComponent.paint(JComponent.java:1065)
at javax.swing.JLayeredPane.paint(JLayeredPane.java:586)
at javax.swing.JComponent.paintToOffscreen(JComponent.java:5210)
at javax.swing.RepaintManager$PaintManager.paintDoubleBuffered(RepaintManager.java:1579)
at javax.swing.RepaintManager$PaintManager.paint(RepaintManager.java:1502)
at javax.swing.RepaintManager.paint(RepaintManager.java:1272)
at javax.swing.JComponent._paintImmediately(JComponent.java:5158)
at javax.swing.JComponent.paintImmediately(JComponent.java:4969)
at javax.swing.RepaintManager$4.run(RepaintManager.java:831)
at javax.swing.RepaintManager$4.run(RepaintManager.java:814)
at java.security.AccessController.doPrivileged(Native Method)
at java.security.ProtectionDomain$JavaSecurityAccessImpl.doIntersectionPrivilege(ProtectionDomain.java:80)
at javax.swing.RepaintManager.paintDirtyRegions(RepaintManager.java:814)
at javax.swing.RepaintManager.paintDirtyRegions(RepaintManager.java:789)
at javax.swing.RepaintManager.prePaintDirtyRegions(RepaintManager.java:738)
at javax.swing.RepaintManager.access$1200(RepaintManager.java:64)
at javax.swing.RepaintManager$ProcessingRunnable.run(RepaintManager.java:1732)
at java.awt.event.InvocationEvent.dispatch(InvocationEvent.java:311)
at java.awt.EventQueue.dispatchEventImpl(EventQueue.java:756)
at java.awt.EventQueue.access$500(EventQueue.java:97)
at java.awt.EventQueue$3.run(EventQueue.java:709)
at java.awt.EventQueue$3.run(EventQueue.java:703)
at java.security.AccessController.doPrivileged(Native Method)
at java.security.ProtectionDomain$JavaSecurityAccessImpl.doIntersectionPrivilege(ProtectionDomain.java:80)
at java.awt.EventQueue.dispatchEvent(EventQueue.java:726)
at java.awt.EventDispatchThread.pumpOneEventForFilters(EventDispatchThread.java:201)
at java.awt.Eve

Re: [JPP-Devel] Slow parsing of JML files

2017-07-31 Thread Rahkonen Jukka (MML)
Hi,

With my data and computer, loading the JML file now takes 1 min 40 sec. Much
better than the 25 minutes it used to take.
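
For a sense of scale, the improvement can be worked out from the two figures in this thread, taking the 25 minutes 14 seconds measured on 29 July as the earlier load time. The sketch below is only the arithmetic; LoadSpeedup is a hypothetical name, and the result is a rough factor of about 15, not a benchmark.

```java
import java.time.Duration;

// Hypothetical helper for illustration: ratio of two load times.
final class LoadSpeedup {

    // Returns how many times faster the second measurement is.
    static double speedup(Duration before, Duration after) {
        return (double) before.getSeconds() / after.getSeconds();
    }

    public static void main(String[] args) {
        Duration before = Duration.ofMinutes(25).plusSeconds(14); // 1514 s
        Duration after = Duration.ofMinutes(1).plusSeconds(40);   // 100 s
        // 1514 / 100 = 15.14, so loading is roughly 15 times faster.
        System.out.println(speedup(before, after));
    }
}
```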

-Jukka-

-Original message-
From: edgar.sol...@web.de [mailto:edgar.sol...@web.de]
Sent: 31 July 2017 12:11
To: jump devel
Subject: Re: [JPP-Devel] Slow parsing of JML files

On 31.07.2017 08:19, Michaël Michaud wrote:
> Hi Ede,
> 
> I can confirm what you say.
> 
> On a file with 1M simple shapes + 6 string attributes + 1 date 
> attribute
> 
> shp : 23s (same time with or without attributes, 19s if the file set 
> has no shx)

wonder how shp reader deals with dates, it can't parse them via 
FlexibleDateParser, if there is no delay.

> jml : *18s* (17s without the date and *5mn with the date and before 
> your optimization !!!*)

Jukka's dataset had 3 date entries over 1.1Mil features, imagine the slow down
 
> json : 8s (date is interpreted as string, I could not convert the
> string formatted as yyyy-MM-dd to a date through View/Edit schema
> plugins*)

weird, will have a look at it. but again, a complete stack would have helped ;)

..thx ede

> Michaël
> 
> Exception in thread "AWT-EventQueue-0" java.lang.IllegalArgumentException: Cannot format given Object as a Date
> at java.text.DateFormat.format(DateFormat.java:310)
> at java.text.Format.format(Format.java:157)
> at com.vividsolutions.jump.workbench.ui.AttributeTablePanel$MyTable$2.setValue(AttributeTablePanel.java:192)
> 
> 
> 

--
Check out the vibrant tech community on one of the world's most engaging tech 
sites, Slashdot.org! http://sdm.link/slashdot 
___
Jump-pilot-devel mailing list
Jump-pilot-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/jump-pilot-devel


Re: [JPP-Devel] Slow parsing of JML files

2017-07-31 Thread edgar . soldin
On 31.07.2017 08:19, Michaël Michaud wrote:
> Hi Ede,
> 
> I can confirm what you say.
> 
> On a file with 1M simple shapes + 6 string attributes + 1 date attribute
> 
> shp : 23s (same time with or without attributes, 19s if the file set has no 
> shx)

wonder how shp reader deals with dates, it can't parse them via 
FlexibleDateParser, if there is no delay.

> jml : *18s* (17s without the date and *5mn with the date and before your 
> optimization !!!*)

Jukka's dataset had 3 date entries over 1.1Mil features, imagine the slow down
 
> json : 8s (date is interpreted as string, I could not convert the string
> formatted as yyyy-MM-dd to a date through View/Edit schema plugins*)

weird, will have a look at it. but again, a complete stack would have helped ;)

..thx ede

> Michaël
> 
> Exception in thread "AWT-EventQueue-0" java.lang.IllegalArgumentException: Cannot format given Object as a Date
> at java.text.DateFormat.format(DateFormat.java:310)
> at java.text.Format.format(Format.java:157)
> at com.vividsolutions.jump.workbench.ui.AttributeTablePanel$MyTable$2.setValue(AttributeTablePanel.java:192)
> 
> 
> 



Re: [JPP-Devel] Slow parsing of JML files

2017-07-30 Thread Michaël Michaud

Hi Ede,

I can confirm what you say.

On a file with 1M simple shapes + 6 string attributes + 1 date attribute

shp : 23s (same time with or without attributes, 19s if the file set has 
no shx)


jml : *18s* (17s without the date and *5mn with the date and before your 
optimization !!!*)


json : 8s (date is interpreted as string, I could not convert the string
formatted as yyyy-MM-dd to a date through View/Edit schema plugins*)


Michaël

Exception in thread "AWT-EventQueue-0" java.lang.IllegalArgumentException: Cannot format given Object as a Date
at java.text.DateFormat.format(DateFormat.java:310)
at java.text.Format.format(Format.java:157)
at com.vividsolutions.jump.workbench.ui.AttributeTablePanel$MyTable$2.setValue(AttributeTablePanel.java:192)




On 30/07/2017 at 18:46, edgar.sol...@web.de wrote:

Jukka,

sorry, r5482 is the latest. commit msg stays the same. ..ede

On 7/30/2017 18:44, edgar.sol...@web.de wrote:

Jukka,

can you try r4682?


Log Message:
---
speeding up JML/GML reader when reading time attributes by parsing them lazy 
(during access later)
- adding FlexibleFeature, FlexibleFeatureSchema
- porting GMLReader to use the flexible classes

..ede

On 7/29/2017 22:50, Rahkonen Jukka (MML) wrote:

Hi,

I took some timings with today's nightly build by opening my dataset with
1101817 polygons and quite many attributes.

Opening time
JML: 25 minutes 14 sec
Shapefile: 1 minute 28 sec
GeoJSON: 45 seconds

Seems that there must be something suboptimal in the JML driver and that Ede 
did really nice work with the GeoJSON driver.

-Jukka-



From: edgar.sol...@web.de
Sent: 29 July 2017 22:25
To: OpenJump develop and use
Subject: Re: [JPP-Devel] Slow parsing of JML files

just spent some time with GMLReader, which is actually our JMLReader and 
according to my finding it gets some magnitudes faster when the date parsing
is commented out (in GMLInputTemplate.getColumnValue()).

had a short look at the FlexibleDateParser and have the impression that regex 
patterns are not precompiled but recompiled on every usage. btw. that would be 
a classic in terms of parsing slowdowns.

..ede

On 7/29/2017 13:03, Michaël Michaud wrote:

Hi,

I also remembered that jml reader was quite slow compared with shp, but my last 
tests show me only slight differences.

(Maybe shapefile reader has slowed down with my last commits - attempts to make 
it more robust with shp coming from esri or qgis)

Now, reading a big file (888000 features) with complex polygons and 3 
attributes is just a bit longer with jml (38 vs 32s)
For a file containing 1M of simple features (squares), it is even faster with 
jml (14 vs 23). Reading more attributes seems longer with jml.

I would say :

reading complex geometries is much longer with jml
reading attributes is longer with jml
reading simple geometries is longer with shx/shp

Jukka, do you have examples where jml takes much longer than shp?

Michaël


On 28/07/2017 at 17:25, edgar.sol...@web.de wrote:

On 28.07.2017 17:06, Rahkonen Jukka (MML) wrote:

Hi,

I noticed with a dataset having 1.2 million features with lots of attributes 
that OpenJUMP is rather slow in parsing its own native JML format. It takes 
about 30 minutes to open the file. Shapefiles are much faster but they do not 
suit me because long strings are truncated. Can the hardcore developers guess 
what is the bottleneck with JML/XML parsing and if there could be some place
for improvements?


hi Jukka,

could you provide said dataset privately?

does it make a difference when you cut down the number of attributes?

..ede


Re: [JPP-Devel] Slow parsing of JML files

2017-07-30 Thread edgar . soldin
Jukka,

sorry, r5482 is the latest. commit msg stays the same. ..ede

On 7/30/2017 18:44, edgar.sol...@web.de wrote:
> Jukka,
> 
> can you try r4682?
> 
>> Log Message:
>> ---
>> speeding up JML/GML reader when reading time attributes by parsing them lazy 
>> (during access later)
>> - adding FlexibleFeature, FlexibleFeatureSchema
>> - porting GMLReader to use the flexible classes
> 
> ..ede
> 
> On 7/29/2017 22:50, Rahkonen Jukka (MML) wrote:
>> Hi,
>>
>> I took some timings with today's nightly build by opening my dataset with
>> 1101817 polygons and quite many attributes.
>>
>> Opening time
>> JML: 25 minutes 14 sec
>> Shapefile: 1 minute 28 sec
>> GeoJSON: 45 seconds
>>
>> Seems that there must be something suboptimal in the JML driver and that Ede 
>> did really nice work with the GeoJSON driver.
>>
>> -Jukka-
>>
>>
>> ________________________
>> From: edgar.sol...@web.de
>> Sent: 29 July 2017 22:25
>> To: OpenJump develop and use
>> Subject: Re: [JPP-Devel] Slow parsing of JML files
>>
>> just spent some time with GMLReader, which is actually our JMLReader and 
>> according to my finding it gets some magnitudes faster when the date
>> parsing is commented out (in GMLInputTemplate.getColumnValue()).
>>
>> had a short look at the FlexibleDateParser and have the impression that 
>> regex patterns are not precompiled but recompiled on every usage. btw. that 
>> would be a classic in terms of parsing slowdowns.
>>
>> ..ede
>>
>> On 7/29/2017 13:03, Michaël Michaud wrote:
>>> Hi,
>>>
>>> I also remembered that jml reader was quite slow compared with shp, but my 
>>> last tests show me only slight differences.
>>>
>>> (Maybe shapefile reader has slowed down with my last commits - attempts to 
>>> make it more robust with shp coming from esri or qgis)
>>>
>>> Now, reading a big file (888000 features) with complex polygons and 3 
>>> attributes is just a bit longer with jml (38 vs 32s)
>>> For a file containing 1M of simple features (squares), it is even faster 
>>> with jml (14 vs 23). Reading more attributes seems longer with jml.
>>>
>>> I would say :
>>>
>>> reading complex geometries is much longer with jml
>>> reading attributes is longer with jml
>>> reading simple geometries is longer with shx/shp
>>>
>>> Jukka, do you have examples where jml takes much longer than shp?
>>>
>>> Michaël
>>>
>>>
>>> On 28/07/2017 at 17:25, edgar.sol...@web.de wrote:
>>>> On 28.07.2017 17:06, Rahkonen Jukka (MML) wrote:
>>>>> Hi,
>>>>>
>>>>> I noticed with a dataset having 1.2 million features with lots of 
>>>>> attributes that OpenJUMP is rather slow in parsing its own native JML 
>>>>> format. It takes about 30 minutes to open the file. Shapefiles are much 
>>>>> faster but they do not suit me because long strings are truncated. Can 
>>>>> the hardcore developers guess what is the bottleneck with JML/XML
>>>>> parsing and if there could be some place for improvements?
>>>>>
>>>> hi Jukka,
>>>>
>>>> could you provide said dataset privately?
>>>>
>>>> does it make a difference when you cut down the number of attributes?
>>>>
>>>> ..ede
>>>>

Re: [JPP-Devel] Slow parsing of JML files

2017-07-30 Thread edgar . soldin
Jukka,

can you try r4682?

> Log Message:
> ---
> speeding up JML/GML reader when reading time attributes by parsing them lazy 
> (during access later)
> - adding FlexibleFeature, FlexibleFeatureSchema
> - porting GMLReader to use the flexible classes
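
The idea in the log message, parsing time attributes lazily on access, can be sketched roughly as below. This is a minimal illustration under stated assumptions: LazyDateAttribute is a hypothetical class, the fixed yyyy-MM-dd pattern is assumed for brevity, and nothing here is taken from the actual FlexibleFeature or GMLReader code.

```java
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.Date;

// Hypothetical sketch; the real FlexibleFeature is not shown in this thread.
final class LazyDateAttribute {
    private final String raw;  // text exactly as read from the file
    private Date parsed;       // filled in on first access
    private boolean resolved;  // true once parsing has been attempted

    LazyDateAttribute(String raw) {
        this.raw = raw;
    }

    // Reading the file only stores the string. The cost of parsing is paid
    // here, once, and only for values that are actually looked at.
    synchronized Date get() {
        if (!resolved) {
            if (raw != null) {
                try {
                    // Assumed pattern; a new instance is created because
                    // SimpleDateFormat is not thread-safe.
                    SimpleDateFormat fmt = new SimpleDateFormat("yyyy-MM-dd");
                    fmt.setLenient(false);
                    parsed = fmt.parse(raw);
                } catch (ParseException e) {
                    parsed = null; // unparseable value is reported as missing
                }
            }
            resolved = true;
        }
        return parsed;
    }

    // True once get() has been called at least once.
    boolean isParsed() {
        return resolved;
    }
}
```

With this shape, opening a file with a million date values does no date parsing at all until a value is displayed or queried.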

..ede

On 7/29/2017 22:50, Rahkonen Jukka (MML) wrote:
> Hi,
> 
> I took some timings with today's nightly build by opening my dataset with
> 1101817 polygons and quite many attributes.
> 
> Opening time
> JML: 25 minutes 14 sec
> Shapefile: 1 minute 28 sec
> GeoJSON: 45 seconds
> 
> Seems that there must be something suboptimal in the JML driver and that Ede 
> did really nice work with the GeoJSON driver.
> 
> -Jukka-
> 
> 
> 
> From: edgar.sol...@web.de
> Sent: 29 July 2017 22:25
> To: OpenJump develop and use
> Subject: Re: [JPP-Devel] Slow parsing of JML files
> 
> just spent some time with GMLReader, which is actually our JMLReader and 
> according to my finding it gets some magnitudes faster when the date parsing
> is commented out (in GMLInputTemplate.getColumnValue()).
> 
> had a short look at the FlexibleDateParser and have the impression that regex 
> patterns are not precompiled but recompiled on every usage. btw. that would 
> be a classic in terms of parsing slowdowns.
> 
> ..ede
> 
> On 7/29/2017 13:03, Michaël Michaud wrote:
>> Hi,
>>
>> I also remembered that jml reader was quite slow compared with shp, but my 
>> last tests show me only slight differences.
>>
>> (Maybe shapefile reader has slowed down with my last commits - attempts to 
>> make it more robust with shp coming from esri or qgis)
>>
>> Now, reading a big file (888000 features) with complex polygons and 3 
>> attributes is just a bit longer with jml (38 vs 32s)
>> For a file containing 1M of simple features (squares), it is even faster 
>> with jml (14 vs 23). Reading more attributes seems longer with jml.
>>
>> I would say :
>>
>> reading complex geometries is much longer with jml
>> reading attributes is longer with jml
>> reading simple geometries is longer with shx/shp
>>
>> Jukka, do you have examples where jml takes much longer than shp?
>>
>> Michaël
>>
>>
>> On 28/07/2017 at 17:25, edgar.sol...@web.de wrote:
>>> On 28.07.2017 17:06, Rahkonen Jukka (MML) wrote:
>>>> Hi,
>>>>
>>>> I noticed with a dataset having 1.2 million features with lots of 
>>>> attributes that OpenJUMP is rather slow in parsing its own native JML 
>>>> format. It takes about 30 minutes to open the file. Shapefiles are much 
>>>> faster but they do not suit me because long strings are truncated. Can the 
>>>> hardcore developers guess what is the bottleneck with JML/XML parsing and
>>>> if there could be some place for improvements?
>>>>
>>> hi Jukka,
>>>
>>> could you provide said dataset privately?
>>>
>>> does it make a difference when you cut down the number of attributes?
>>>
>>> ..ede
>>>


Re: [JPP-Devel] Slow parsing of JML files

2017-07-29 Thread edgar . soldin
Jukka,

thanks for the flowers :).. try the same w/o date attributes, that should speed 
up JML by magnitudes.

..ede

On 7/29/2017 22:50, Rahkonen Jukka (MML) wrote:
> Hi,
> 
> I took some timings with today's nightly build by opening my dataset with
> 1101817 polygons and quite many attributes.
> 
> Opening time
> JML: 25 minutes 14 sec
> Shapefile: 1 minute 28 sec
> GeoJSON: 45 seconds
> 
> Seems that there must be something suboptimal in the JML driver and that Ede 
> did really nice work with the GeoJSON driver.
> 
> -Jukka-
> 
> 
> 
> From: edgar.sol...@web.de
> Sent: 29 July 2017 22:25
> To: OpenJump develop and use
> Subject: Re: [JPP-Devel] Slow parsing of JML files
> 
> just spent some time with GMLReader, which is actually our JMLReader and 
> according to my finding it gets some magnitudes faster when the date parsing
> is commented out (in GMLInputTemplate.getColumnValue()).
> 
> had a short look at the FlexibleDateParser and have the impression that regex 
> patterns are not precompiled but recompiled on every usage. btw. that would 
> be a classic in terms of parsing slowdowns.
> 
> ..ede
> 
> On 7/29/2017 13:03, Michaël Michaud wrote:
>> Hi,
>>
>> I also remembered that jml reader was quite slow compared with shp, but my 
>> last tests show me only slight differences.
>>
>> (Maybe shapefile reader has slowed down with my last commits - attempts to 
>> make it more robust with shp coming from esri or qgis)
>>
>> Now, reading a big file (888000 features) with complex polygons and 3 
>> attributes is just a bit longer with jml (38 vs 32s)
>> For a file containing 1M of simple features (squares), it is even faster 
>> with jml (14 vs 23). Reading more attributes seems longer with jml.
>>
>> I would say :
>>
>> reading complex geometries is much longer with jml
>> reading attributes is longer with jml
>> reading simple geometries is longer with shx/shp
>>
>> Jukka, do you have examples where jml takes much longer than shp?
>>
>> Michaël
>>
>>
>> On 28/07/2017 at 17:25, edgar.sol...@web.de wrote:
>>> On 28.07.2017 17:06, Rahkonen Jukka (MML) wrote:
>>>> Hi,
>>>>
>>>> I noticed with a dataset having 1.2 million features with lots of 
>>>> attributes that OpenJUMP is rather slow in parsing its own native JML 
>>>> format. It takes about 30 minutes to open the file. Shapefiles are much 
>>>> faster but they do not suit me because long strings are truncated. Can the 
>>>> hardcore developers guess what is the bottleneck with JML/XML parsing and
>>>> if there could be some place for improvements?
>>>>
>>> hi Jukka,
>>>
>>> could you provide said dataset privately?
>>>
>>> does it make a difference when you cut down the number of attributes?
>>>
>>> ..ede
>>>


Re: [JPP-Devel] Slow parsing of JML files

2017-07-29 Thread Rahkonen Jukka (MML)
Hi,

I took some timings with today's nightly build by opening my dataset with
1101817 polygons and quite many attributes.

Opening time
JML: 25 minutes 14 sec
Shapefile: 1 minute 28 sec
GeoJSON: 45 seconds

Seems that there must be something suboptimal in the JML driver and that Ede 
did really nice work with the GeoJSON driver.

-Jukka-



From: edgar.sol...@web.de
Sent: 29 July 2017 22:25
To: OpenJump develop and use
Subject: Re: [JPP-Devel] Slow parsing of JML files

just spent some time with GMLReader, which is actually our JMLReader and 
according to my finding it gets some magnitudes faster when the date parsing
is commented out (in GMLInputTemplate.getColumnValue()).

had a short look at the FlexibleDateParser and have the impression that regex 
patterns are not precompiled but recompiled on every usage. btw. that would be 
a classic in terms of parsing slowdowns.

..ede

On 7/29/2017 13:03, Michaël Michaud wrote:
> Hi,
>
> I also remembered that jml reader was quite slow compared with shp, but my 
> last tests show me only slight differences.
>
> (Maybe shapefile reader has slowed down with my last commits - attempts to 
> make it more robust with shp coming from esri or qgis)
>
> Now, reading a big file (888000 features) with complex polygons and 3 
> attributes is just a bit longer with jml (38 vs 32s)
> For a file containing 1M of simple features (squares), it is even faster with 
> jml (14 vs 23). Reading more attributes seems longer with jml.
>
> I would say :
>
> reading complex geometries is much longer with jml
> reading attributes is longer with jml
> reading simple geometries is longer with shx/shp
>
> Jukka, do you have examples where jml takes much longer than shp?
>
> Michaël
>
>
> On 28/07/2017 at 17:25, edgar.sol...@web.de wrote:
>> On 28.07.2017 17:06, Rahkonen Jukka (MML) wrote:
>>> Hi,
>>>
>>> I noticed with a dataset having 1.2 million features with lots of 
>>> attributes that OpenJUMP is rather slow in parsing its own native JML 
>>> format. It takes about 30 minutes to open the file. Shapefiles are much 
>>> faster but they do not suit me because long strings are truncated. Can the 
>>> hardcore developers guess what is the bottleneck with JML/XML parsing and
>>> if there could be some place for improvements?
>>>
>> hi Jukka,
>>
>> could you provide said dataset privately?
>>
>> does it make a difference when you cut down the number of attributes?
>>
>> ..ede
>>


Re: [JPP-Devel] Slow parsing of JML files

2017-07-29 Thread edgar . soldin
just spent some time with GMLReader, which is actually our JMLReader and 
according to my finding it gets some magnitudes faster when the date parsing
is commented out (in GMLInputTemplate.getColumnValue()).

had a short look at the FlexibleDateParser and have the impression that regex 
patterns are not precompiled but recompiled on every usage. btw. that would be 
a classic in terms of parsing slowdowns.
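
To illustrate the suspected pattern, the sketch below contrasts a regex that is recompiled on every call with one compiled once and reused. IsoDateMatcher and its methods are hypothetical names and the pattern is an assumed example; this is not the FlexibleDateParser code, which is not shown in this thread.

```java
import java.util.regex.Pattern;

// Hypothetical illustration; not the actual FlexibleDateParser code.
final class IsoDateMatcher {

    // Compiled once when the class is loaded and reused for every value.
    private static final Pattern ISO_DATE =
            Pattern.compile("(\\d{4})-(\\d{2})-(\\d{2})");

    // Slow variant: String.matches() compiles the regex on every call.
    static boolean looksLikeIsoDateSlow(String s) {
        return s.matches("(\\d{4})-(\\d{2})-(\\d{2})");
    }

    // Fast variant: only a lightweight Matcher is created per call.
    static boolean looksLikeIsoDate(String s) {
        return ISO_DATE.matcher(s).matches();
    }

    public static void main(String[] args) {
        System.out.println(looksLikeIsoDate("2017-07-31")); // true
        System.out.println(looksLikeIsoDate("31.07.2017")); // false
    }
}
```

Both variants return the same answers; the difference only shows up as accumulated compile time when the check runs once per attribute value over a million features.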

..ede 

On 7/29/2017 13:03, Michaël Michaud wrote:
> Hi,
> 
> I also remembered that jml reader was quite slow compared with shp, but my 
> last tests show me only slight differences.
> 
> (Maybe shapefile reader has slowed down with my last commits - attempts to 
> make it more robust with shp coming from esri or qgis)
> 
> Now, reading a big file (888000 features) with complex polygons and 3 
> attributes is just a bit longer with jml (38 vs 32s)
> For a file containing 1M of simple features (squares), it is even faster with 
> jml (14 vs 23). Reading more attributes seems longer with jml.
> 
> I would say :
> 
> reading complex geometries is much longer with jml
> reading attributes is longer with jml
> reading simple geometries is longer with shx/shp
> 
> Jukka, do you have examples where jml takes much longer than shp?
> 
> Michaël
> 
> 
> On 28/07/2017 at 17:25, edgar.sol...@web.de wrote:
>> On 28.07.2017 17:06, Rahkonen Jukka (MML) wrote:
>>> Hi,
>>>
>>> I noticed with a dataset having 1.2 million features with lots of 
>>> attributes that OpenJUMP is rather slow in parsing its own native JML 
>>> format. It takes about 30 minutes to open the file. Shapefiles are much 
>>> faster but they do not suit me because long strings are truncated. Can the 
>>> hardcore developers guess what is the bottleneck with JML/XML parsing and
>>> if there could be some place for improvements?
>>>
>> hi Jukka,
>>
>> could you provide said dataset privately?
>>
>> does it make a difference when you cut down the number of attributes?
>>
>> ..ede
>>


Re: [JPP-Devel] Slow parsing of JML files

2017-07-29 Thread Michaël Michaud

Hi,

I also remembered that jml reader was quite slow compared with shp, but 
my last tests show me only slight differences.


(Maybe shapefile reader has slowed down with my last commits - attempts 
to make it more robust with shp coming from esri or qgis)


Now, reading a big file (888000 features) with complex polygons and 3 
attributes is just a bit longer with jml (38 vs 32s)
For a file containing 1M of simple features (squares), it is even faster 
with jml (14 vs 23). Reading more attributes seems longer with jml.


I would say :

reading complex geometries is much longer with jml
reading attributes is longer with jml
reading simple geometries is longer with shx/shp

Jukka, do you have examples where jml takes much longer than shp?

Michaël


On 28/07/2017 at 17:25, edgar.sol...@web.de wrote:

On 28.07.2017 17:06, Rahkonen Jukka (MML) wrote:

Hi,

I noticed with a dataset having 1.2 million features with lots of attributes 
that OpenJUMP is rather slow in parsing its own native JML format. It takes 
about 30 minutes to open the file. Shapefiles are much faster but they do not 
suit me because long strings are truncated. Can the hardcore developers guess 
what is the bottleneck with JML/XML parsing and if there could be some place
for improvements?


hi Jukka,

could you provide said dataset privately?

does it make a difference when you cut down the number of attributes?

..ede



Re: [JPP-Devel] Slow parsing of JML files

2017-07-28 Thread edgar . soldin
On 28.07.2017 17:06, Rahkonen Jukka (MML) wrote:
> Hi,
> 
> I noticed with a dataset having 1.2 million features with lots of attributes 
> that OpenJUMP is rather slow in parsing its own native JML format. It takes 
> about 30 minutes to open the file. Shapefiles are much faster but they do not 
> suit me because long strings are truncated. Can the hardcore developers guess 
> what is the bottleneck with JML/XML parsing and if there could be some place
> for improvements?
> 

hi Jukka,

could you provide said dataset privately? 

does it make a difference when you cut down the number of attributes?

..ede
