Re: KVP and data that contains a forward slash

2016-02-11 Thread Derek Atkins
John,

John Ralls  writes:

>> Originally the KVP path was supposed to be a file-system.  So if there
>> is a desire to move away from that, then we should be explicit about it.
>
> I hope that's not literally true in the sense of writing it out to
> disk, that would be incredibly stupid. Surely you mean that it was
> meant to have paths *like* a file system, which it does. It's still an
> inefficient design because there's no locality so every step is a (or
> more likely several because of the hash tables) cache miss.

No, of course it's not in the sense of writing out to disk.  That would
be silly!  :-)

Keep in mind it was designed when GnuCash ONLY had XML, and the "paths"
in a KVP tree directly mapped to the XML object hierarchy.

>>> Most of our existing use of nested KVP isn't necessary anyway, but
>>> changing it will require a new file/db version and conversion
>>> routines. No point in introducing new nesting though, and besides that
>>> would also create a file incompatibility between 2.6 and 2.8.
>> 
>> I would disagree; it's nice to have some nesting (certainly within the
>> XML framework) to make it easy to remove whole KVP subtrees.  When you
>> view KVP as a file system then it makes total sense to have nesting, and
>> all the benefits that come with it.
>
> That would carry more weight if we actually used the XML DOM tree
> inside of GnuCash, but we don't. Besides, that's not how we use
> it. With a couple of exceptions (import matching and book properties)
> we use it to add members to classes without changing the XML or SQL
> Schema.  Those paths are two or three deep and for the most part have
> only one or two elements at the bottom. Getting rid of KvpFrames and
> converting the "path" to a string name so that it takes a single hash
> lookup instead 2 or three will be more performant with no affect on
> our actual usage. The payoff is even higher when the KVP data isn't
> all in memory: Having to do 3 SQL queries to retrieve an int64_t or a
> string is ridiculous. For most of those uses having to do a separate
> query at all is ridiculous. The members should be part of the object
> record and retrieved when the object is instantiated.

There are two separate issues here:

1) The SQL encoding of KVP.  The fact that it's encoded the way it is,
using frames that multiple queries, arguable is itself a bug.  I'm not
sure why we didn't use full-path encodings in SQL.  But that is
completely seprate from how KVPs are used/stored internally.

2) Not loading KVP with the object.  When an object is instantiated it
should, arguably, load all the associated KVP data, too, so that it can
be accessed directly through the object APIs.  Whether it's in core as a
KVP or in core as a table column is mostly irrelevant.

Of course fixing #1 would help with #2.

> Regards,
> John Ralls

-derek

-- 
   Derek Atkins, SB '93 MIT EE, SM '95 MIT Media Laboratory
   Member, MIT Student Information Processing Board  (SIPB)
   URL: http://web.mit.edu/warlord/PP-ASEL-IA N1NWH
   warl...@mit.eduPGP key available
___
gnucash-devel mailing list
gnucash-devel@gnucash.org
https://lists.gnucash.org/mailman/listinfo/gnucash-devel


Re: KVP and data that contains a forward slash

2016-02-11 Thread John Ralls

> On Feb 11, 2016, at 11:20 AM, Derek Atkins  wrote:
> 
> John,
> 
> John Ralls  writes:
> 
>>> Originally the KVP path was supposed to be a file-system.  So if there
>>> is a desire to move away from that, then we should be explicit about it.
>> 
>> I hope that's not literally true in the sense of writing it out to
>> disk, that would be incredibly stupid. Surely you mean that it was
>> meant to have paths *like* a file system, which it does. It's still an
>> inefficient design because there's no locality so every step is a (or
>> more likely several because of the hash tables) cache miss.
> 
> No, of course it's not in the sense of writing out to disk.  That would
> be silly!  :-)
> 
> Keep in mind it was designed when GnuCash ONLY had XML, and the "paths"
> in a KVP tree directly mapped to the XML object hierarchy.
> 
 Most of our existing use of nested KVP isn't necessary anyway, but
 changing it will require a new file/db version and conversion
 routines. No point in introducing new nesting though, and besides that
 would also create a file incompatibility between 2.6 and 2.8.
>>> 
>>> I would disagree; it's nice to have some nesting (certainly within the
>>> XML framework) to make it easy to remove whole KVP subtrees.  When you
>>> view KVP as a file system then it makes total sense to have nesting, and
>>> all the benefits that come with it.
>> 
>> That would carry more weight if we actually used the XML DOM tree
>> inside of GnuCash, but we don't. Besides, that's not how we use
>> it. With a couple of exceptions (import matching and book properties)
>> we use it to add members to classes without changing the XML or SQL
>> Schema.  Those paths are two or three deep and for the most part have
>> only one or two elements at the bottom. Getting rid of KvpFrames and
>> converting the "path" to a string name so that it takes a single hash
>> lookup instead 2 or three will be more performant with no affect on
>> our actual usage. The payoff is even higher when the KVP data isn't
>> all in memory: Having to do 3 SQL queries to retrieve an int64_t or a
>> string is ridiculous. For most of those uses having to do a separate
>> query at all is ridiculous. The members should be part of the object
>> record and retrieved when the object is instantiated.
> 
> There are two separate issues here:
> 
> 1) The SQL encoding of KVP.  The fact that it's encoded the way it is,
> using frames that multiple queries, arguable is itself a bug.  I'm not
> sure why we didn't use full-path encodings in SQL.  But that is
> completely seprate from how KVPs are used/stored internally.
> 
> 2) Not loading KVP with the object.  When an object is instantiated it
> should, arguably, load all the associated KVP data, too, so that it can
> be accessed directly through the object APIs.  Whether it's in core as a
> KVP or in core as a table column is mostly irrelevant.
> 

Again, the path argument would be more convincing if we actually used XPath. I 
don't think it really matters what grand plan the designer (you?) had in mind, 
what matters is how we're using it now.

No, SQL just makes the penalty more severe. Cache misses are expensive; so much 
so that the current C++ doctrine (based on many simulations) is that it's much 
faster to sequentially search a std::vector than to use any container that 
relies on independently allocated nodes, even when the overhead of complete 
reallocation for an insert operation is accounted for. Every indirection in 
KVP, i.e. every step involving a KvpFrame means dereferencing the KvpFrame*, 
dereferencing its GHashTable*, dereferencing the hash table's hashes and 
getting the next KvpItem*, dereferencing it, getting its content type and ptr, 
and if it's a KvpFrame, repeating the process. Because each of them is a 
different type GSlice will have a different magazine for each of them. If not 
much else is going on on the system all of the magazines might be in cache 
after the first round and subsequent descents will be faster. Or not.

That overhead could be reduced immensely by not having KvpFrames. With the 
whole path as a key there'd be a single GHashtable lookup and only three 
derefences (the QofInstance's Kvp GHashtable, the KvpItem* it returns, and the 
KvpItem's contents).

All of that said, I agree that not implementing KVP in the SQL backend that way 
was a mistake and I regret it.

Moving almost all of the Kvp access to being through GObject properties does 
make all of the access via the object's API. The next step is to add members to 
the objects and load them from KVP in the backend. That gets rid of the Kvp 
performance penalty entirely for the XML backend and makes it one-time for the 
SQL backend. 

Regards,
John Ralls


___
gnucash-devel mailing list
gnucash-devel@gnucash.org
https://lists.gnucash.org/mailman/listinfo/gnucash-devel


Re: Tax Sched report duplicates accounts

2016-02-11 Thread David T.
Ted,

I see some of the oddities with the date controls as well. I will note that I 
do not observe any duplication in the report, and neither do I get two years’ 
data incorporated. 

What I see with the dates is that the two settings are applied separately. So 
if you have the From and To entries set to Previous Year and Alternate Period 
to Last Year, it will give you two years ago (that is, the Last Year from the 
Previous Year). If you have both date controls in use, some other limits appear 
to apply as well. For example, a "Last Year" Alternate Period setting limits 
the report to a full year report, regardless of the From - To settings. That 
is, if you use Previous Quarter in From and To, it doesn’t change the report. 
If you want a report for a shorter or variant time period, then you can set the 
Alternate period to Use From - To. 

Reset defaults works for me. Can you explain what happens when you click it?

David


> On Feb 11, 2016, at 1:40 PM, Ted Creedon  wrote:
> 
> For year 2015
> 
> 
> There is an apparent conflict between options>general>>from >>to buttons and 
> alternate period drop down box
> 
> 
> It works fine if alternate period>>is not "last  year" but is set to >>"use 
> from to" if left as the default "last year" ; the "last year" setting creates 
> the problem of duplication and incorrectly adds in 2014 to 2015
> 
> 
> I.e. the reset defaults box/button doesn't work properly either..
> 
> 
> I hope this helps
> 
> 
> email if more info needed
> 
> 
> tedc
> 
> 
> 
> From: Alex Aycinena 
> Sent: Tuesday, February 9, 2016 1:53 PM
> To: gnucash-devel; Ted Creedon
> Subject: Re: Tax Sched report duplicates accounts
> 
> 
> -- Forwarded message --
> From: Ted Creedon >
> To: "gnucash-devel@gnucash.org" 
> >
> Cc:
> Date: Mon, 8 Feb 2016 19:12:54 +
> Subject: Tax Sched reoprt duplicates accounts
> Depending on the General tab settings the report duplicates accounts 2x or 3x
> gc ver 2.6.11
> 
> on 86_64 x86_64 x86_64 GNU/Linux
> 
> tedc
> 
> 
> Can you provide more information? I would like to reproduce the problem. What 
> does "depending on the General tab settings ..." mean? With some settings it 
> does but with others it doesn't? You're referring to the General tab in the 
> report options? How do I have to set it up to get the same problem?
> 
> Alex
> ___
> gnucash-devel mailing list
> gnucash-devel@gnucash.org
> https://lists.gnucash.org/mailman/listinfo/gnucash-devel


___
gnucash-devel mailing list
gnucash-devel@gnucash.org
https://lists.gnucash.org/mailman/listinfo/gnucash-devel


Re: Tax Sched report duplicates accounts

2016-02-11 Thread Ted Creedon
For year 2015


There is an apparent conflict between options>general>>from >>to buttons and 
alternate period drop down box


It works fine if alternate period>>is not "last  year" but is set to >>"use 
from to" if left as the default "last year" ; the "last year" setting creates 
the problem of duplication and incorrectly adds in 2014 to 2015


I.e. the reset defaults box/button doesn't work properly either..


I hope this helps


email if more info needed


tedc



From: Alex Aycinena 
Sent: Tuesday, February 9, 2016 1:53 PM
To: gnucash-devel; Ted Creedon
Subject: Re: Tax Sched report duplicates accounts


-- Forwarded message --
From: Ted Creedon >
To: "gnucash-devel@gnucash.org" 
>
Cc:
Date: Mon, 8 Feb 2016 19:12:54 +
Subject: Tax Sched reoprt duplicates accounts
Depending on the General tab settings the report duplicates accounts 2x or 3x
 gc ver 2.6.11

on 86_64 x86_64 x86_64 GNU/Linux

tedc


Can you provide more information? I would like to reproduce the problem. What 
does "depending on the General tab settings ..." mean? With some settings it 
does but with others it doesn't? You're referring to the General tab in the 
report options? How do I have to set it up to get the same problem?

Alex
___
gnucash-devel mailing list
gnucash-devel@gnucash.org
https://lists.gnucash.org/mailman/listinfo/gnucash-devel