Hi,
I ran into a a ginq "issue" today.
I have two list of rows where each key is the column name (actually
List<Row> but you can think about it as a List<Map>) collections that
looks like this:
employees: 3 obs * 5 variables
mainSsn coSsn firstName salary startDate
111 Rick 623.3 2012-01-01
222 444 Dan 515.2 2013-09-23
333 555 Michelle 611.0 2014-11-15
eln: 5 obs * 2 variables
customerId lei
2 111
3 222
4 333
5 444
6 555
I want to join employees with eln and add customerIs columns matching
mainSsn and CoSsn. Thinking SQL, I wanted to do something like this
def result =GQ {
from tin employees
leftjoin mcidin elnon t.mainSsn == mcid.lei leftjoin cocidin elnon t.coSsn ==
cocid.lei select t.*, mcid?.customerId as'mainCustomerId', cocid?.customerId
as'coCustomerId' }
but this is not supported in ginq and I could not find any signs of support for
wildcards in the docs.
I found that could do this:
def result = GQ {
from t in table
leftjoin mcid in ssnCustomerId on t.mainSsn == mcid.lei
leftjoin cocid in ssnCustomerId on t.coSsn == cocid.lei
select t + mcid?.customerId+ cocid?.customerId
}
Which gives me a List of Lists but then i loose the column names. i.e.
merged: 3 obs * 7 variables
c1 c2 c3 c4 c5 c6 c7
111 444 Rick 623.3 2012-01-01 2 5
222 Dan 515.2 2013-09-23 3 null
333 555 Michelle 611.0 2014-11-15 4 6
Instead i had to do this:
def result =GQ {
from tin employees
leftjoin mcidin elnon t.mainSsn == mcid.lei leftjoin cocidin elnon t.coSsn ==
cocid.lei select t.toMap() + [mainCustomerId: mcid?.customerId] +
[coCustomerId: cocid?.customerId]
}
(This relies on the fact that a matrix Row has a toMap() method)
ginq result content:
[{mainSsn=111, coSsn=, firstName=Rick, salary=623.3,
startDate=2012-01-01, mainCustomerId=2, coCustomerId=null},
{mainSsn=222, coSsn=444, firstName=Dan, salary=515.2,
startDate=2013-09-23, mainCustomerId=3, coCustomerId=5}, {mainSsn=333,
coSsn=555, firstName=Michelle, salary=611.0, startDate=2014-11-15,
mainCustomerId=4, coCustomerId=6}]
Which is could then easily transform back into a list of rows (actually
a matrix but you can think of it as a List<Map>
result matrix content:
merged: 3 obs * 7 variables
mainSsn coSsn firstName salary startDate mainCustomerId
coCustomerId
111 Rick 623.3
2012-01-01 2 null
222 444 Dan 515.2
2013-09-23 3 5
333 555 Michelle 611.0 2014-11-15
4 6
Has there been any discussions about supporting breaking up the select
objects into parts using wildcards and deemed it not feasible to
implement or did nobody have this problem before?
Something like this could perhaps be an alternative to wildcard syntax:
def result =GQ {
from tin employees
leftjoin mcidin elnon t.mainSsn == mcid.lei leftjoin cocidin elnon t.coSsn ==
cocid.lei select toMap(t) + [mainCustomerId: mcid?.customerId,coCustomerId:
cocid?.customerId])
}
what do you think?
Best regards,
Per