With regard to returning -1, I believe the relevant count methods are the ones detailed below. If the API is changing anyway, it may make sense to allow an 'exact' (or equivalent) parameter to force the count evaluation, and to remove the somewhat unintuitive (to me at least) distinction between count and size.

Thanks,

Emilio


DataFeatureCollection: (note: no javadoc, but the implication is that it can return -1 to align with FeatureSource)

  public abstract int getCount() throws IOException;

FeatureCollection: (may *not* return -1)

  /**
     * Please note this operation may be expensive when working with remote content.
     *
     * @see java.util.Collection#size()
     */
    int size();

FeatureSource: (may return -1)

 /**
     * Gets the number of the features that would be returned by the given
     * {@code Query}, taking into account any settings for max features and
     * start index set on the {@code Query}.
     * <p>
     * It is possible that this method will return {@code -1} if the calculation
     * of number of features is judged to be too costly by the implementing class.
     * In this case, you might call <code>getFeatures(query).size()</code>
     * instead.
     * <p>
     * Example use:<pre><code> int count = featureSource.getCount(query);
     * if( count == -1 ){
     *    count = featureSource.getFeatures(query).size();
     * }</code></pre>
     *
     * @param query the query to select features
     *
     * @return the number of features that would be returned by the {@code Query};
     *         or {@code -1} if this cannot be calculated.
     *
     * @throws IOException if there are errors getting the count
     */
    int getCount(Query query) throws IOException;
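
To make the 'exact' idea concrete, here is a minimal sketch. Everything in it is hypothetical and not part of GeoTools: the `CountingSource` interface, the `exact` flag, and the toy `ListBackedSource` with its `cheapLimit` threshold. It only keeps the -1 convention for "too costly" that the FeatureSource javadoc describes.

```java
import java.util.Arrays;
import java.util.List;

class ExactCountSketch {
    public static void main(String[] args) {
        CountingSource source = new ListBackedSource(Arrays.asList("a", "b", "c"), 2);
        System.out.println(source.getCount(false)); // -1 (3 features exceed the cheap limit of 2)
        System.out.println(source.getCount(true));  // 3
    }
}

/** Hypothetical interface (not the GeoTools API): one count method with an 'exact' flag. */
interface CountingSource {
    /**
     * @param exact if true, always compute the real count; if false, the
     *              implementation may return -1 when counting is too costly
     */
    long getCount(boolean exact);
}

/** Toy implementation that treats counting as "too costly" above a threshold. */
class ListBackedSource implements CountingSource {
    private final List<String> features;
    private final int cheapLimit;

    ListBackedSource(List<String> features, int cheapLimit) {
        this.features = features;
        this.cheapLimit = cheapLimit;
    }

    @Override
    public long getCount(boolean exact) {
        if (!exact && features.size() > cheapLimit) {
            return -1; // caller did not insist on an exact count
        }
        return features.size();
    }
}
```

With a single method like this, the caller decides up front whether a possibly slow exact count is acceptable, instead of having to know which of count or size may return -1.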



On 9/2/20 2:10 PM, Jim Hughes wrote:

Hi all,

The JavaDoc on this method reminded me of one of the points I wanted to suggest.  If there are multiple methods for counting records, it may be good to discuss the semantics.  If I recall, some of the methods in GeoTools have the idea that returning a -1 is a suitable way to communicate that getting the exact count would be too expensive.

As an anchor point, I'm kinda happy when I see a distributed system scanning over a million-ish records per second.  With that back of the envelope, if a filter matched a billion or so features, it may take 10-15 minutes to get an exact count.

In GeoMesa, we implemented a system property around returning exact counts or not.  I think we noticed this the most when GeoServer returned GeoJSON, since that request pathway calls getCount and then getFeatures (which in GeoMesa's case basically repeats the query!).

Anyhow, having an 'int' method and a 'long' method to call may help limit the amount of time spent counting;).

Cheers,

Jim

On 9/2/2020 1:54 PM, Andrea Aime wrote:
Hi Jody,
I like this road, works fine for FeatureSource.
What about FeatureCollection though? It's already using "int size()", from the interface:

/**
 * Please note this operation may be expensive when working with remote content.
 *
 * @see java.util.Collection#size()
 */
int size();

Cheers
Andrea


On Wed, Sep 2, 2020 at 7:45 PM Jody Garnett <jody.garn...@gmail.com> wrote:

    Here is another softer approach:
    /**
     * @return Returns the number of features in this collection, if
     * this collection contains more than Integer.MAX_VALUE elements,
     * returns Integer.MAX_VALUE.
     * @deprecated Please use count()
     */
    public int getCount();

    public long size();

    This gives us a clear API migration and does not immediately
    break projects when they upgrade. The size() name matches how
    Collection.size() handles a size greater than
    Integer.MAX_VALUE.
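
A minimal sketch of how the softer approach could be wired up, assuming the deprecated int method is a default method that delegates to the new long one. `SizedCollection` is a hypothetical stand-in, not the GeoTools FeatureCollection, and the sketch assumes the replacement method is the `long size()` declared above.

```java
class SaturatingCountSketch {
    public static void main(String[] args) {
        SizedCollection small = () -> 42L;
        SizedCollection huge = () -> 5_000_000_000L;
        System.out.println(small.getCount()); // 42
        System.out.println(huge.getCount());  // 2147483647
    }
}

/** Hypothetical stand-in for a feature collection; not the GeoTools interface. */
interface SizedCollection {
    /** New long-based count. */
    long size();

    /**
     * Legacy int count, saturating at Integer.MAX_VALUE in the same way
     * java.util.Collection#size() is documented to.
     *
     * @deprecated use {@link #size()}
     */
    @Deprecated
    default int getCount() {
        return (int) Math.min(size(), Integer.MAX_VALUE);
    }
}
```

Existing callers of getCount() keep compiling and never see a wrapped-around negative value; implementations only need to provide the long method.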

    --
    Jody Garnett


    On Wed, 2 Sep 2020 at 06:43, Andrea Aime
    <andrea.a...@geo-solutions.it> wrote:

        Hi, any other opinion on this?

        Personally I would not like breaking all existing store
        implementations and clients, for a "clean break" it seems
        quite bloody :-D
        But Jody suggests going that way.

        Some tie breaker, or even multiple votes leading to another
        tie, would be appreciated, ha!

        Cheers
        Andrea

        On Thu, Aug 27, 2020 at 11:16 PM Andrea Aime
        <andrea.a...@geo-solutions.it> wrote:

            Hi Jody,
            comment inline.

            On Thu, Aug 27, 2020 at 11:06 PM Jody Garnett
            <jody.garn...@gmail.com> wrote:

                We are just about to start a new release cycle, API
                changes are a short-term pain, but the most
                maintainable approach long term.

                As for the change, how about returning a
                long? Existing client code that used an integer would
                be easy to update.

                    long count = featureSource.getCount(query);

                Or:

                    int count = (int) featureSource.getCount(query);


            We can gauge how easy it is by doing the switch in GT/GS...
            Maybe it could be done as a refactor... but I cannot
            imagine exactly how yet. Closest thing to something
            working may be:

            1) Rename existing getCount to getCountOld via refactor
            2) Add a new method called getCount, returning long, make
            it abstract, have a default implementation of getCountOld
            delegating to getCount
            3) Fix all implementations, switching them from
            getCountOld to getCount
            4) Inline getCountOld, that should fix all calling points

            Hopefully that should not be too much work, I hope there
            are few implementations of FeatureSource/Collection but many
            client code calls using them.
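
The intermediate state after steps 2 and 3 might look roughly like the sketch below. `CountSource` and `FixedSource` are hypothetical, simplified types (no Query argument, no IOException), and using Math.toIntExact in the legacy method is an assumption; a plain cast would also satisfy the steps as described.

```java
class RefactorSketch {
    public static void main(String[] args) {
        CountSource source = new FixedSource(1234L);
        int legacy = source.getCountOld(); // old call sites keep compiling
        long current = source.getCount();  // what they become once step 4 inlines the old method
        System.out.println(legacy + " " + current); // 1234 1234
    }
}

/** Hypothetical, simplified source interface showing the state after step 2. */
interface CountSource {
    /** Step 2: the new abstract method, returning long. */
    long getCount();

    /** Steps 1 and 2: the renamed legacy method, now a default that delegates. */
    @Deprecated
    default int getCountOld() {
        // Assumption: fail loudly on overflow rather than truncate.
        return Math.toIntExact(getCount());
    }
}

/** Step 3: an implementation switched over to the long-returning method. */
class FixedSource implements CountSource {
    private final long count;

    FixedSource(long count) {
        this.count = count;
    }

    @Override
    public long getCount() {
        return count;
    }
}
```

Because getCountOld has a single delegating body, an IDE "inline method" refactor in step 4 can rewrite every call site mechanically.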


                If you really want, Java 8 has a Math.toIntExact
                method that throws an exception if the long is out
                of range:

                    int count = Math.toIntExact(
                    featureSource.getCount(query) );
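
For comparison, a small standalone sketch of what the two narrowing options do with an out-of-range value. The class name and the sample value are made up for illustration; only the standard `(int)` cast and `Math.toIntExact` are used.

```java
class NarrowingSketch {
    public static void main(String[] args) {
        long big = 3_000_000_000L; // larger than Integer.MAX_VALUE

        int truncated = (int) big; // plain cast silently wraps around
        System.out.println(truncated); // -1294967296

        try {
            Math.toIntExact(big); // throws instead of wrapping
        } catch (ArithmeticException e) {
            System.out.println("overflow detected: " + e.getMessage());
        }

        // The -1 "unknown count" marker survives either conversion unchanged.
        System.out.println(Math.toIntExact(-1L)); // -1
    }
}
```

The plain cast can turn a very large count into a negative number, which a caller could confuse with an error or "unknown" marker, while Math.toIntExact makes the overflow explicit.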


            That would also work yes

            Cheers
            Andrea

            == GeoServer Professional Services from the experts!
            Visit http://goo.gl/it488V for more information. == Ing.
            Andrea Aime @geowolf Technical Lead GeoSolutions S.A.S.
            Via di Montramito 3/A 55054 Massarosa (LU) phone: +39
            0584 962313 fax: +39 0584 1660272 mob: +39 339 8844549
            http://www.geo-solutions.it
            http://twitter.com/geosolutions_it
            -------------------------------------------------------
            /Con riferimento alla normativa sul trattamento dei dati
            personali (Reg. UE 2016/679 - Regolamento generale sulla
            protezione dei dati “GDPR”), si precisa che ogni
            circostanza inerente alla presente email (il suo
            contenuto, gli eventuali allegati, etc.) è un dato la cui
            conoscenza è riservata al/i solo/i destinatario/i
            indicati dallo scrivente. Se il messaggio Le è giunto per
            errore, è tenuta/o a cancellarlo, ogni altra operazione è
            illecita. Le sarei comunque grato se potesse darmene
            notizia. This email is intended only for the person or
            entity to which it is addressed and may contain
            information that is privileged, confidential or otherwise
            protected from disclosure. We remind that - as provided
            by European Regulation 2016/679 “GDPR” - copying,
            dissemination or use of this e-mail or the information
            herein by anyone other than the intended recipient is
            prohibited. If you have received this email by mistake,
            please notify us immediately by telephone or e-mail./









_______________________________________________
GeoTools-Devel mailing list
GeoTools-Devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/geotools-devel

