Re: [Libreoffice-ux-advise] Pivot Table data provider extension framework (removal possibility)
On Thu, 2013-03-21 at 12:24 -0400, Kohei Yoshida wrote:

> I have a cunning idea. Since one of the difficulties with this is reaching the actual users of this functionality, I'd like to remove the 4th check box from the current pivot table data source selection dialog in 4.1 (and maybe 4.0.x if you guys agree) and see if anyone reports it as a bug.

Sounds rather sensible to me :-)

HTH,

Michael.

--
michael.me...@suse.com, Pseudo Engineer, itinerant idiot
___
Libreoffice-ux-advise mailing list
Libreoffice-ux-advise@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/libreoffice-ux-advise
Re: [Libreoffice-ux-advise] Pivot Table data provider extension framework (removal possibility)
On Wed, Mar 20, 2013 at 5:20 AM, Eike Rathke er...@redhat.com wrote:

> Hi Kohei,
>
> On Thursday, 2013-03-14 09:26:55 -0400, Kohei Yoshida wrote:

> > > > I believe the same functionality can be achieved via database connectivity, by having such an external data provider register as a database and using it as a data provider for pivot tables. So, I don't see a reason why we need to keep this as a separate data source category.

> > > IMHO the advantage of the data provider is that the actual data does not have to reside in the spreadsheet, allowing for massive amounts of data records but providing only the information necessary for the pivot table. This could maybe be accomplished with a registered data source as well, but currently we have no means to pull the data without actually storing it in the spreadsheet for further processing. Or isn't that the case?

> > Well, that would depend on what you actually mean by storing (the data) in the spreadsheet. When pulling data via database connectivity, we don't actually copy the data into the spreadsheet document, but generate the pivot table output directly from it. But we *do* first populate the pivot cache from the database internally, so a copy of the data will sit in memory while the document is open.

> That's my bad then. I assumed the data was stored in a DB range. Is that different with the data provider, i.e. does it avoid copying all the data to populate the pivot cache, using an interface that directly populates the laid-out pivot table instead?

Well, that's how it is implemented today. It's not by design but due to how this feature has evolved historically. This data provider interface was designed and put in place *before* we added this pivot cache backend. This difference actually causes additional headaches, since we can't always assume that the pivot cache is populated, which ties our hands in many places in the pivot engine.

> Other advantages a data provider could have are a) being able to collect data from various, e.g. remote, sources that a simple data connection could not provide,

Yes, but to achieve that, one has to implement the *whole pivot result calc engine*. To me that's overkill, just to avoid implementing a simple data connectivity backend. It would be much simpler to just write a data connectivity backend and re-use the database connectivity backend of the pivot table.

> and b) accessing data by means not possible with database connectivity, for example if the user is to be restricted to a subset of a database or is not to be able to query using SQL statements.

Sure. But I'm sure that could be implemented via some sort of data connectivity proxy, which to me would be much simpler than developing the entire result calc engine from scratch.

> Probably there'll always be _some_ use cases such a provider could have (does Excel have that? if yes then there are ...),

Unless I missed something (someone could enlighten me on this), Excel only provides MS SQL connectivity, which is the equivalent of our database connectivity backend.

> so if it's ripped out, maybe offering a new interface adapted to the new data types and structures, one that sits on top of the engine instead of being part of it, would be good.

Sure. But to justify this enormous design constraint, I'd like to hear from the actual users / deployers about why this special data provider was needed in the first place, so that we can judge whether their requirement still justifies the complexity it imposes on 100% of pivot table users, including those who don't use this data provider backend (which I imagine constitutes 99.9% of all pivot table users).

Kohei
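As a rough illustration of the "data connectivity proxy" idea mentioned above, here is a minimal Python sketch. It is not LibreOffice code: the class `RestrictedSource`, the table `orders`, and all column names are hypothetical, and SQLite merely stands in for whatever database sits behind the proxy. The point is only that a deployer can fix the subset in advance, so the consumer receives plain records and never issues SQL itself.

```python
import sqlite3
from typing import List, Sequence, Tuple


class RestrictedSource:
    """Hypothetical proxy: exposes one fixed, pre-filtered record set.

    The deployer chooses table, columns and filter up front; the consumer
    only ever calls records() and has no way to run its own query.
    """

    def __init__(self, conn: sqlite3.Connection, table: str,
                 columns: Sequence[str], where: str,
                 params: Sequence[object] = ()) -> None:
        self._conn = conn
        self.columns = list(columns)
        cols = ", ".join(self.columns)
        # The query is assembled once from deployer-supplied values only.
        self._query = f"SELECT {cols} FROM {table} WHERE {where}"
        self._params = tuple(params)

    def records(self) -> List[Tuple]:
        """Return the permitted rows; this is all a pivot backend would see."""
        return self._conn.execute(self._query, self._params).fetchall()


# Example data: three orders, only the "emea" ones should be visible.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (region TEXT, customer TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [("emea", "acme", 100.0), ("apac", "initech", 250.0), ("emea", "globex", 40.0)],
)

source = RestrictedSource(conn, "orders", ["customer", "amount"],
                          "region = ?", ("emea",))
visible = source.records()
```

In this sketch the restriction lives entirely in the proxy, so the consuming side needs nothing beyond a way to read tabular records, which is what a database-style backend already does.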
Re: [Libreoffice-ux-advise] Pivot Table data provider extension framework (removal possibility)
Hi Kohei,

On Thursday, 2013-03-14 09:26:55 -0400, Kohei Yoshida wrote:

> > > I believe the same functionality can be achieved via database connectivity, by having such an external data provider register as a database and using it as a data provider for pivot tables. So, I don't see a reason why we need to keep this as a separate data source category.

> > IMHO the advantage of the data provider is that the actual data does not have to reside in the spreadsheet, allowing for massive amounts of data records but providing only the information necessary for the pivot table. This could maybe be accomplished with a registered data source as well, but currently we have no means to pull the data without actually storing it in the spreadsheet for further processing. Or isn't that the case?

> Well, that would depend on what you actually mean by storing (the data) in the spreadsheet. When pulling data via database connectivity, we don't actually copy the data into the spreadsheet document, but generate the pivot table output directly from it. But we *do* first populate the pivot cache from the database internally, so a copy of the data will sit in memory while the document is open.

That's my bad then. I assumed the data was stored in a DB range. Is that different with the data provider, i.e. does it avoid copying all the data to populate the pivot cache, using an interface that directly populates the laid-out pivot table instead?

Other advantages a data provider could have are a) being able to collect data from various, e.g. remote, sources that a simple data connection could not provide, and b) accessing data by means not possible with database connectivity, for example if the user is to be restricted to a subset of a database or is not to be able to query using SQL statements.

Probably there'll always be _some_ use cases such a provider could have (does Excel have that? if yes then there are ...), so if it's ripped out, maybe offering a new interface adapted to the new data types and structures, one that sits on top of the engine instead of being part of it, would be good.

  Eike

--
LibreOffice Calc developer. Number formatter stricken i18n transpositionizer.
New GnuPG key 0x65632D3A : 2265 D7F3 A7B0 95CC 3918 630B 6A6C D5B7 6563 2D3A
Old GnuPG key 0x293C05FD : 997A 4C60 CE41 0149 0DB3 9E96 2F1A D073 293C 05FD
Support the FSFE, care about Free Software! https://fsfe.org/support/?erack
Re: [Libreoffice-ux-advise] Pivot Table data provider extension framework (removal possibility)
Hi Eike,

Thanks for your reply.

On Wed, Mar 13, 2013 at 4:48 PM, Eike Rathke er...@redhat.com wrote:

> Hi Kohei,
>
> On Tuesday, 2013-03-12 11:41:32 -0400, Kohei Yoshida wrote:

> > I'd like to ask whether someone actually uses this Pivot Table data provider extension framework, because I'd like to remove this if nobody is using it, or only a few people are using it.

> From what I remember that can be used to populate pivot tables with data obtained from external resources like databases. Unfortunately you'll hardly find such extensions in the wild, but rather within enterprises and among corporate users, so determining whether it's actually used or not is nearly impossible unless someone knows who those customers are.

Understood. I imagined it would be used only in such an enterprise setting, by someone with enough resources to develop the major part of the pivot engine as an extension.

> > I believe the same functionality can be achieved via database connectivity, by having such an external data provider register as a database and using it as a data provider for pivot tables. So, I don't see a reason why we need to keep this as a separate data source category.

> IMHO the advantage of the data provider is that the actual data does not have to reside in the spreadsheet, allowing for massive amounts of data records but providing only the information necessary for the pivot table. This could maybe be accomplished with a registered data source as well, but currently we have no means to pull the data without actually storing it in the spreadsheet for further processing. Or isn't that the case?

Well, that would depend on what you actually mean by storing (the data) in the spreadsheet. When pulling data via database connectivity, we don't actually copy the data into the spreadsheet document, but generate the pivot table output directly from it. But we *do* first populate the pivot cache from the database internally, so a copy of the data will sit in memory while the document is open. I consider the pivot cache part an implementation detail, so I'm not sure if that's what you meant by storing it in the spreadsheet...

> > The way it is currently implemented also makes it *extremely* difficult for us to optimize the pivot table engine, because all its functionality has to go through the UNO API, which forces us to do data conversion *twice* for every single transaction. That's very, very expensive, especially as the data size grows (and it always does).

> Seconded.

And this to me is a considerable obstacle to further speeding up the engine and reducing its memory usage.

> > So, I'd *love* to get rid of this sooner rather than later, and I'd like to know whether there are people who would absolutely need this functionality, and if so why. As I said above, I believe the same functionality could be achieved via the database connectivity backend even if we remove the extension backend.

> I think work needs to be done there to pull the data and provide it in a form that pivot tables can actually process. It may be viable, but I'm really not familiar with pivot table topics.

Yes. So, anyone who currently uses this data provider extension backend would have to change the way the data is connected to Calc's pivot table, which will require *some* work. But, considering that developing such an extension requires non-trivial resources (it's almost half of the whole pivot table engine), I would imagine they could spare a bit of those resources to set up database connectivity to achieve what they need... At least that's what I'm hoping.

Kohei
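To make the distinction above concrete (records pulled once into an in-memory cache, with the pivot output computed from that cache rather than from cells in a sheet), here is a small Python sketch. It is only an analogy and not how Calc is implemented: the functions `build_cache` and `pivot_sum`, the `sales` table, and its columns are all made up for illustration, with SQLite standing in for a registered database.

```python
import sqlite3
from collections import defaultdict
from typing import Dict, List, Tuple


def build_cache(conn: sqlite3.Connection, query: str) -> Tuple[List[str], List[Tuple]]:
    """Pull every record once; the copy lives in memory, not in any sheet."""
    cursor = conn.execute(query)
    fields = [column[0] for column in cursor.description]
    return fields, cursor.fetchall()


def pivot_sum(fields: List[str], rows: List[Tuple],
              row_field: str, col_field: str, data_field: str) -> Dict[Tuple, float]:
    """Aggregate the cached records into (row label, column label) -> sum."""
    r, c, d = (fields.index(name) for name in (row_field, col_field, data_field))
    totals: Dict[Tuple, float] = defaultdict(float)
    for record in rows:
        totals[(record[r], record[c])] += record[d]
    return dict(totals)


# Hypothetical source data.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (region TEXT, year INTEGER, amount REAL)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?, ?)",
    [("north", 2012, 10.0), ("north", 2012, 5.0), ("south", 2013, 7.5)],
)

fields, rows = build_cache(conn, "SELECT region, year, amount FROM sales")
result = pivot_sum(fields, rows, "region", "year", "amount")
```

Here `rows` plays the role of the cache: it holds a full copy of the queried data for as long as it is kept around, while `result` is the only thing that would end up being laid out.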
Re: [Libreoffice-ux-advise] Pivot Table data provider extension framework (removal possibility)
Hi Kohei, Eike,

> I'd like to ask whether someone actually uses this Pivot Table data provider extension framework, because I'd like to remove this if nobody is using it, or only a few people are using it.

Hm, a list that is decidedly low-volume is probably the wrong list here... I don't know, maybe the users list is a better fit..?

Astron.
[Libreoffice-ux-advise] Pivot Table data provider extension framework (removal possibility)
Hi there,

I'd like to ask whether someone actually uses this Pivot Table data provider extension framework, because I'd like to remove this if nobody is using it, or only a few people are using it.

Currently, Calc's pivot table supports 4 different backends. They are:

1. cell range on sheet
2. named range on sheet
3. database (registered via database manager)
4. extension acting as a data provider

The 4th one is what I'd like to get rid of.

When creating a pivot table via Data - Pivot Table - Create..., you'll get a dialog with these 4 choices. The 4th one, labeled "External source/interface", is usually disabled *unless* you have an extension installed that implements all pivot table interfaces necessary to act as a data provider. These are UNO interfaces that were recently *un*-published in the 4.0 release. On this page:

https://wiki.documentfoundation.org/ReleaseNotes/4.0

all UNO services/interfaces/etc. starting with com.sun.star.sheet.DataPilotSource and below are the ones that are relevant for this data provider functionality.

I believe the same functionality can be achieved via database connectivity, by having such an external data provider register as a database and using it as a data provider for pivot tables. So, I don't see a reason why we need to keep this as a separate data source category.

The way it is currently implemented also makes it *extremely* difficult for us to optimize the pivot table engine, because all its functionality has to go through the UNO API, which forces us to do data conversion *twice* for every single transaction. That's very, very expensive, especially as the data size grows (and it always does).

So, I'd *love* to get rid of this sooner rather than later, and I'd like to know whether there are people who would absolutely need this functionality, and if so why. As I said above, I believe the same functionality could be achieved via the database connectivity backend even if we remove the extension backend.

Thanks,

--
Kohei Yoshida, LibreOffice hacker, Calc
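The "conversion twice per transaction" cost described above can be pictured with a toy model in Python. This is only a sketch under loud assumptions: `Boxed`, `to_api`, and `from_api` are hypothetical stand-ins for wrapping a native value into a generic API value and unwrapping it again; they are not real UNO types or calls, and the real conversions are more involved.

```python
from dataclasses import dataclass
from typing import Any, List


@dataclass
class Boxed:
    """Hypothetical generic API value (a stand-in, not a real UNO type)."""
    type_name: str
    payload: Any


conversions = 0  # counts every crossing of the assumed API boundary


def to_api(value: float) -> Boxed:
    """Provider side: wrap a native value for transport through the API."""
    global conversions
    conversions += 1
    return Boxed("double", value)


def from_api(boxed: Boxed) -> float:
    """Engine side: unwrap the generic value back into a native one."""
    global conversions
    conversions += 1
    return float(boxed.payload)


def sum_via_api(values: List[float]) -> float:
    """Every value is converted twice: once going in, once coming out."""
    return sum(from_api(to_api(v)) for v in values)


def sum_direct(values: List[float]) -> float:
    """Same result on the native representation, with no conversion at all."""
    return sum(values)


data = [1.5, 2.5, 4.0]
total_api = sum_via_api(data)
api_conversions = conversions  # two conversions per value
total_direct = sum_direct(data)
```

Both paths give the same total, but the API path performs two conversions per value, so the overhead grows linearly with the number of records, which is the concern raised about large data sets.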