Re: [Geotools-devel] Raster classifier operation for large inputs
Ahhh sorry, totally missed that, my apologies, too much multitasking on my part :)

Anyways, +1 from me fwiw.

On Wed, Nov 14, 2018 at 11:59 AM Andrea Aime wrote:

> On Wed, Nov 14, 2018 at 7:54 PM Justin Deoliveira wrote:
>
>> Hey Andrea,
>> All of your changes sound good to me. Only question I have is whether
>> your proposed change will replace what is there? Or is your thought to add
>> some config parameter that would trigger the histogram / approximation
>> based method?
>
> New entry in the methods enum, to be used by the caller when approximate
> calculation is desirable.
> Citing from my initial mail (yes, it was a bit of a wall of text):
>
> "Ideally, these would be new entries in the ClassificationMethod
> enumeration, say QUANTILE_HISTOGRAM and NATURAL_BREAKS_HISTOGRAM, and
> ClassBreaksOpImage would have an extra optional parameter to decide the
> bucket count (with some reasonable defaults, e.g. 256 for byte data,
> 1000 for any other type)."
>
>> As for moving the code to jai-ext, that definitely makes sense to me.
>
> Great, thanks for following up!
>
> Cheers
> Andrea
>
> ==
> GeoServer Professional Services from the experts! Visit
> http://goo.gl/it488V for more information.
> ==
> Ing. Andrea Aime
> @geowolf
> Technical Lead
> GeoSolutions S.A.S.
> Via di Montramito 3/A
> 55054 Massarosa (LU)
> phone: +39 0584 962313
> fax: +39 0584 1660272
> mob: +39 339 8844549
> http://www.geo-solutions.it
> http://twitter.com/geosolutions_it
> ---
> This email is intended only for the person or entity to which it is
> addressed and may contain information that is privileged, confidential or
> otherwise protected from disclosure. We remind that - as provided by
> European Regulation 2016/679 "GDPR" - copying, dissemination or use of
> this e-mail or the information herein by anyone other than the intended
> recipient is prohibited. If you have received this email by mistake,
> please notify us immediately by telephone or e-mail.

___
GeoTools-Devel mailing list
GeoTools-Devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/geotools-devel
Re: [Geotools-devel] Raster classifier operation for large inputs
On Wed, Nov 14, 2018 at 7:54 PM Justin Deoliveira wrote:

> Hey Andrea,
> All of your changes sound good to me. Only question I have is whether your
> proposed change will replace what is there? Or is your thought to add some
> config parameter that would trigger the histogram / approximation based
> method?

New entry in the methods enum, to be used by the caller when approximate calculation is desirable.
Citing from my initial mail (yes, it was a bit of a wall of text):

"Ideally, these would be new entries in the ClassificationMethod enumeration, say QUANTILE_HISTOGRAM and NATURAL_BREAKS_HISTOGRAM, and ClassBreaksOpImage would have an extra optional parameter to decide the bucket count (with some reasonable defaults, e.g. 256 for byte data, 1000 for any other type)."

> As for moving the code to jai-ext, that definitely makes sense to me.

Great, thanks for following up!

Cheers
Andrea
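The reply above boils down to three rules: new enum entries, a default bucket count that depends on the data type, and a mandatory "extrema" parameter for the histogram methods. A minimal sketch of those rules follows. Every name in it (SketchClassificationMethod, ClassBreaksParamsSketch, the histogramBased flag, the {min, max} array for a single band) is hypothetical and only mirrors the wording of the proposal; it is not the actual ClassificationMethod or ClassBreaksOpImage code, and the non-histogram entries are listed only for context.

```java
import java.awt.image.DataBuffer;

/** Hypothetical stand-in for the enumeration discussed in the thread. */
enum SketchClassificationMethod {
    EQUAL_INTERVAL(false),
    QUANTILE(false),
    NATURAL_BREAKS(false),
    // Proposed approximate variants, computed from a histogram.
    QUANTILE_HISTOGRAM(true),
    NATURAL_BREAKS_HISTOGRAM(true);

    final boolean histogramBased;

    SketchClassificationMethod(boolean histogramBased) {
        this.histogramBased = histogramBased;
    }
}

/** Hypothetical parameter handling, assuming a single band. */
final class ClassBreaksParamsSketch {

    /** Defaults quoted in the proposal: 256 for byte data, 1000 otherwise. */
    static int defaultBucketCount(int dataType) {
        return dataType == DataBuffer.TYPE_BYTE ? 256 : 1000;
    }

    /**
     * Histogram methods need min/max up front to lay out the buckets,
     * so missing extrema are rejected; exact methods do not need them.
     */
    static void validate(SketchClassificationMethod method, double[] extrema) {
        if (!method.histogramBased) {
            return;
        }
        if (extrema == null || extrema.length != 2 || !(extrema[1] > extrema[0])) {
            throw new IllegalArgumentException(
                    method + " requires extrema as {min, max} with max > min");
        }
    }
}
```

With this shape the caller opts in explicitly by picking one of the *_HISTOGRAM entries, so the existing exact methods keep their current behaviour.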
Re: [Geotools-devel] Raster classifier operation for large inputs
For what it's worth, I like the plan: improving the current code and making it shareable between multiple parts of the codebase by pushing it back to JAI-Ext.
It will be nice to have in the SLDService.

Regards,
Simone Giannecchini
==
Ing. Simone Giannecchini
@simogeo
Founder/Director
GeoSolutions S.A.S.
Via di Montramito 3/A
55054 Massarosa (LU)
Italy
phone: +39 0584 962313
fax: +39 0584 1660272
mob: +39 333 8128928
http://www.geo-solutions.it
http://twitter.com/geosolutions_it

On Wed, Nov 14, 2018 at 5:01 PM Andrea Aime wrote:

> Hi,
> I'm looking into extending the GeoServer SLDService API to work against
> raster data too.
> The current code in that module works off the vector classification
> functions for equal intervals, natural breaks and quantiles.
>
> When looking at extending it for rasters, I stumbled into the
> ClassBreaksOpImage and its subclasses, which do more or less what I
> need... with a hitch though: the rasters that I'm playing with can be
> large and have floats/doubles.
>
> Looking at the implementation for quantiles and natural breaks I've
> noticed that all input values get collected either in a List<Double> or a
> Map<Double, Integer>, where the doubles are the values and the integer is
> a pixel count.
> Mind, this is the same as the vector code is doing, but getting to a
> million of those in raster space only requires a 1000x1000 image... and
> millions of double values (or map entries) take up a lot of space. I could
> look into using non-boxed variants, but the issue is not really that one,
> it's just that keeping track of all values requires too much space.
>
> So I'd like to add an approximate calculator instead that collects
> histograms, and then works off the result applying the same logic as
> today. Ideally, these would be new entries in the ClassificationMethod
> enumeration, say QUANTILE_HISTOGRAM and NATURAL_BREAKS_HISTOGRAM, and
> ClassBreaksOpImage would have an extra optional parameter to decide the
> bucket count (with some reasonable defaults, e.g. 256 for byte data,
> 1000 for any other type).
> Working off histograms has a clear benefit: the size of the working memory
> is fixed at the start, and it's possible to use primitives in the data
> structure. Of course it also means the resulting classification won't be
> exact, but it should be close enough.
> The downside is that the min/max values need to be known in advance to
> build the buckets, so for the histogram based methods the "extrema"
> parameter in the ClassBreaksOpImage will be mandatory (exception thrown if
> not provided).
>
> How does that sound?
>
> Cheers
> Andrea
>
> PS: most operations are in jai-ext, mind if the ClassBreaksOpImage gets
> moved there, in its own module?
Re: [Geotools-devel] Raster classifier operation for large inputs
Hey Andrea,

All of your changes sound good to me. Only question I have is whether your proposed change will replace what is there? Or is your thought to add some config parameter that would trigger the histogram / approximation based method?

As for moving the code to jai-ext, that definitely makes sense to me.

-Justin

On Wed, Nov 14, 2018 at 10:00 AM Andrea Aime wrote:

> Hi,
> I'm looking into extending the GeoServer SLDService API to work against
> raster data too.
> The current code in that module works off the vector classification
> functions for equal intervals, natural breaks and quantiles.
>
> When looking at extending it for rasters, I stumbled into the
> ClassBreaksOpImage and its subclasses, which do more or less what I
> need... with a hitch though: the rasters that I'm playing with can be
> large and have floats/doubles.
>
> Looking at the implementation for quantiles and natural breaks I've
> noticed that all input values get collected either in a List<Double> or a
> Map<Double, Integer>, where the doubles are the values and the integer is
> a pixel count.
> Mind, this is the same as the vector code is doing, but getting to a
> million of those in raster space only requires a 1000x1000 image... and
> millions of double values (or map entries) take up a lot of space. I could
> look into using non-boxed variants, but the issue is not really that one,
> it's just that keeping track of all values requires too much space.
>
> So I'd like to add an approximate calculator instead that collects
> histograms, and then works off the result applying the same logic as
> today. Ideally, these would be new entries in the ClassificationMethod
> enumeration, say QUANTILE_HISTOGRAM and NATURAL_BREAKS_HISTOGRAM, and
> ClassBreaksOpImage would have an extra optional parameter to decide the
> bucket count (with some reasonable defaults, e.g. 256 for byte data,
> 1000 for any other type).
> Working off histograms has a clear benefit: the size of the working memory
> is fixed at the start, and it's possible to use primitives in the data
> structure. Of course it also means the resulting classification won't be
> exact, but it should be close enough.
> The downside is that the min/max values need to be known in advance to
> build the buckets, so for the histogram based methods the "extrema"
> parameter in the ClassBreaksOpImage will be mandatory (exception thrown if
> not provided).
>
> How does that sound?
>
> Cheers
> Andrea
>
> PS: most operations are in jai-ext, mind if the ClassBreaksOpImage gets
> moved there, in its own module?
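The space argument in the quoted mail can be put into rough numbers. The sketch below compares a lower bound for keeping every pixel value (8 bytes per primitive double, ignoring any boxing, list or map overhead, which would only make it worse) with a histogram of long counters. The class and method names are hypothetical, and the use of one long counter per bucket is an assumption, not something stated in the thread.

```java
/** Hypothetical back-of-the-envelope helper, not part of GeoTools or JAI-Ext. */
final class MemoryEstimateSketch {

    /** Lower bound for storing every pixel as a primitive double. */
    static long allValuesBytes(long width, long height) {
        long pixels = Math.multiplyExact(width, height);
        return Math.multiplyExact(pixels, (long) Double.BYTES);
    }

    /** Fixed cost of a histogram, assuming one long counter per bucket. */
    static long histogramBytes(int buckets) {
        return (long) buckets * Long.BYTES;
    }

    public static void main(String[] args) {
        // 1000x1000 image: 8,000,000 bytes of values vs 8,000 bytes of counters.
        System.out.println(allValuesBytes(1000, 1000) + " vs " + histogramBytes(1000));
        // 20000x20000 image: 3,200,000,000 bytes of values, histogram unchanged.
        System.out.println(allValuesBytes(20000, 20000) + " vs " + histogramBytes(1000));
    }
}
```

The histogram cost depends only on the bucket count, which is the "fixed at the start" property the proposal relies on.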
[Geotools-devel] Raster classifier operation for large inputs
Hi,
I'm looking into extending the GeoServer SLDService API to work against raster data too.
The current code in that module works off the vector classification functions for equal intervals, natural breaks and quantiles.

When looking at extending it for rasters, I stumbled into the ClassBreaksOpImage and its subclasses, which do more or less what I need... with a hitch though: the rasters that I'm playing with can be large and have floats/doubles.

Looking at the implementation for quantiles and natural breaks I've noticed that all input values get collected either in a List<Double> or a Map<Double, Integer>, where the doubles are the values and the integer is a pixel count.
Mind, this is the same as the vector code is doing, but getting to a million of those in raster space only requires a 1000x1000 image... and millions of double values (or map entries) take up a lot of space. I could look into using non-boxed variants, but the issue is not really that one, it's just that keeping track of all values requires too much space.

So I'd like to add an approximate calculator instead that collects histograms, and then works off the result applying the same logic as today. Ideally, these would be new entries in the ClassificationMethod enumeration, say QUANTILE_HISTOGRAM and NATURAL_BREAKS_HISTOGRAM, and ClassBreaksOpImage would have an extra optional parameter to decide the bucket count (with some reasonable defaults, e.g. 256 for byte data, 1000 for any other type).
Working off histograms has a clear benefit: the size of the working memory is fixed at the start, and it's possible to use primitives in the data structure. Of course it also means the resulting classification won't be exact, but it should be close enough.
The downside is that the min/max values need to be known in advance to build the buckets, so for the histogram based methods the "extrema" parameter in the ClassBreaksOpImage will be mandatory (exception thrown if not provided).

How does that sound?

Cheers
Andrea

PS: most operations are in jai-ext, mind if the ClassBreaksOpImage gets moved there, in its own module?
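For readers who want to see the histogram idea from the mail above in concrete form, here is a minimal sketch of approximate quantile breaks. It assumes a single band already available as a double[], equal-width buckets over the known [min, max] range, and breaks placed at the upper edge of the bucket where the cumulative count reaches each quantile. HistogramQuantileSketch and its methods are hypothetical names for illustration; this is not the ClassBreaksOpImage implementation, and the natural breaks variant is not covered.

```java
/** Hypothetical illustration of histogram-based quantile breaks. */
final class HistogramQuantileSketch {

    /** Counts values into equal-width buckets over [min, max]; memory is fixed by the bucket count. */
    static long[] histogram(double[] values, double min, double max, int buckets) {
        if (!(max > min) || buckets < 1) {
            throw new IllegalArgumentException("need max > min and at least one bucket");
        }
        long[] counts = new long[buckets];
        double width = (max - min) / buckets;
        for (double v : values) {
            if (Double.isNaN(v) || v < min || v > max) {
                continue; // ignore NaN and out-of-range values
            }
            int idx = (int) ((v - min) / width);
            if (idx >= buckets) {
                idx = buckets - 1; // v == max belongs to the last bucket
            }
            counts[idx]++;
        }
        return counts;
    }

    /**
     * Returns classes + 1 break values. Inner breaks are the upper edge of the
     * bucket in which the cumulative count reaches k/classes of the total, so
     * each break is accurate to within one bucket width.
     */
    static double[] quantileBreaks(long[] counts, double min, double max, int classes) {
        if (classes < 1) {
            throw new IllegalArgumentException("need at least one class");
        }
        long total = 0;
        for (long c : counts) {
            total += c;
        }
        double width = (max - min) / counts.length;
        double[] breaks = new double[classes + 1];
        breaks[0] = min;
        breaks[classes] = max;
        long cumulative = 0;
        int bucket = 0;
        for (int k = 1; k < classes; k++) {
            double target = (double) total * k / classes;
            // advance until adding the current bucket reaches the target count
            while (bucket < counts.length && cumulative + counts[bucket] < target) {
                cumulative += counts[bucket];
                bucket++;
            }
            breaks[k] = min + width * Math.min(bucket + 1, counts.length);
        }
        return breaks;
    }
}
```

In a real raster operation the histogram would be filled tile by tile rather than from one array, but the working set stays at counts.length counters either way, which is the trade-off described in the mail: bounded memory in exchange for breaks that are only as precise as the bucket width.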