Re: [Neo4j] Need help with simple query to sort & filter duplicates

2018-04-03 Thread 'Michael Hunger' via Neo4j
Hi,


You could do two things:

1. Mark the last run with  a dedicated label, that you update if you insert
new data.
And use that label for querying for the last one.

e.g.

OPTIONAL MATCH (n:HP:Last) WHERE n.id = $id  REMOVE n:Last
CREATE (n:HP:Last {id:$id, })


MATCH (a:HP:Last) WHERE a.project='CHP' and a.type='test-instance' and
a.`cycle-id` ='11743' RETURN distinct a.id, a.`test-id`, a.`exec-date`,
a.`last-modified`, a.status, a.name, a.owner order by a.id


2. Alternatively you can sort them by id and date


MATCH (a:HP) WHERE a.project='CHP' and a.type='test-instance' and
a.`cycle-id` ='11743'
WITH a ORDER BY a.id ASC, a.`exec-date` DESC
WITH a.id, head(collect(a)) as last
RETURN distinct last.id , last.`test-id`, last.`exec-date`,
last.`last-modified`, last.status, last.name , last.owner
order by last.id 

HTH Michael

On Tue, Apr 3, 2018 at 8:04 AM,  wrote:

> I am new to Neo4j Cypher queries. I am getting HPALM data in Neo4j & then
> rendering it on grafana. HPALM creates a new row with every test run under
> the same "test-id". Due to this the summary test status shows wrong count
> due to duplicate entries with one row showing as status=Passed (latest
> run) & another as status=Failed (older run). But I need only the latest row
> data, the one with status=Passed in this example. below is the query which
> shows sample data having duplicate enrties/ rows but with different test
> 'status' & different 'last-modified' time. I need query to extract the
> unique / distinct latest entries & filter out older duplicate entries/
> rows.
> you can see first 2 rows with same id=44079, test-id=72534 but different
> status Failed & Passed & different 'last-modified' date time.
>
> MATCH (a:HP) WHERE a.project='CHP' and a.type='test-instance' and
> a.`cycle-id` ='11743' RETURN distinct a.id, a.`test-id`, a.`exec-date`,
> a.`last-modified`, a.status, a.name, a.owner order by a.id
>
> ╒═╤═══╤═╤═══╤═╤══╤═══╕
> │a.id │a.`test-id`│a.`exec-date`│a.`last-modified`  │a.status 
> │a.name│a.owner│
> ╞═╪═══╪═╪═══╪═╪══╪═══╡
> │44079│72534  │2018-03-09   │2018-03-09 07:42:30│Failed   
> │(null)│g942477│
> ├─┼───┼─┼───┼─┼──┼───┤
> │44079│72534  │2018-03-20   │2018-03-20 13:01:31│Passed   
> │(null)│g942477│
> ├─┼───┼─┼───┼─┼──┼───┤
> │44080│72533  │2018-03-07   │2018-03-07 11:12:54│Failed   
> │(null)│g942477│
> ├─┼───┼─┼───┼─┼──┼───┤
> │44080│72533  │2018-03-21   │2018-03-21 10:42:11│Passed   
> │(null)│g942477│
> ├─┼───┼─┼───┼─┼──┼───┤
> │44081│72532  │2018-03-05   │2018-03-05 11:51:28│Passed   
> │(null)│g942477│
> ├─┼───┼─┼───┼─┼──┼───┤
> │44082│72526  │2018-03-05   │2018-03-05 11:56:54│Failed   
> │(null)│g942477│
> ├─┼───┼─┼───┼─┼──┼───┤
> │44082│72526  │2018-03-21   │2018-03-21 10:57:40│Not 
> Completed│(null)│g942477│
>
>
> Additionally if this is resolved I need to inject the same logic in below 
> summary query to filter out the duplicates:
>
> Currently the real count of Failed, No Run & Not Completed shows higher than 
> expected due adding up of duplicates.
>
> MATCH (a:HP)
> WHERE  a.project='CHP'  and a.type='test-instance' and a.`cycle-id`='11743'
> RETURN  a.status , count(*) as cnt, a.`cycle-id` order by a.status
>
> ╒═╤═══╤╕
> │a.status │cnt│a.`cycle-id`│
> ╞═╪═══╪╡
> │Failed   │4  │11743   │
> ├─┼───┼┤
> │No Run   │2  │11743   │
> ├─┼───┼┤
> │Not Completed│5  │11743   │
> ├─┼───┼┤
> │Passed   │31 │11743   │
> └─┴───┴┘
>
> Any help to resolve this is most appreciated. I need help forming query with 
> "Group by" & "having" clause here with example. I also need help creating 
> query for sorting by date time.
>
> Cannot find correct functions for date time addition / subtraction & sorting 
> as the field type is "string". Let me know if you need more information.
>
> I am using version 3.0.6 of Neo4j
>
>- Version:
>3.0.6
>- Name:
>graph.db
>
> --
> You received this message because you are subscribed to the Google Groups
> "Neo4j" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to neo4j+unsubscr...@googlegroups.com.
> For more options, visit https://groups.google.com/d/optout.
>

-- 
You received this message because you are subscribed to the Google Groups 
"Neo4j" group.
To 

[Neo4j] Need help with simple query to sort & filter duplicates

2018-04-03 Thread meghalv
I am new to Neo4j Cypher queries. I am getting HPALM data in Neo4j & then 
rendering it on grafana. HPALM creates a new row with every test run under 
the same "test-id". Due to this the summary test status shows wrong count 
due to duplicate entries with one row showing as status=Passed (latest 
run) & another as status=Failed (older run). But I need only the latest row 
data, the one with status=Passed in this example. below is the query which 
shows sample data having duplicate enrties/ rows but with different test 
'status' & different 'last-modified' time. I need query to extract the 
unique / distinct latest entries & filter out older duplicate entries/ 
rows. 
you can see first 2 rows with same id=44079, test-id=72534 but different 
status Failed & Passed & different 'last-modified' date time.

MATCH (a:HP) WHERE a.project='CHP' and a.type='test-instance' and 
a.`cycle-id` ='11743' RETURN distinct a.id, a.`test-id`, a.`exec-date`, 
a.`last-modified`, a.status, a.name, a.owner order by a.id

╒═╤═══╤═╤═══╤═╤══╤═══╕
│a.id │a.`test-id`│a.`exec-date`│a.`last-modified`  │a.status 
│a.name│a.owner│
╞═╪═══╪═╪═══╪═╪══╪═══╡
│44079│72534  │2018-03-09   │2018-03-09 07:42:30│Failed   
│(null)│g942477│
├─┼───┼─┼───┼─┼──┼───┤
│44079│72534  │2018-03-20   │2018-03-20 13:01:31│Passed   
│(null)│g942477│
├─┼───┼─┼───┼─┼──┼───┤
│44080│72533  │2018-03-07   │2018-03-07 11:12:54│Failed   
│(null)│g942477│
├─┼───┼─┼───┼─┼──┼───┤
│44080│72533  │2018-03-21   │2018-03-21 10:42:11│Passed   
│(null)│g942477│
├─┼───┼─┼───┼─┼──┼───┤
│44081│72532  │2018-03-05   │2018-03-05 11:51:28│Passed   
│(null)│g942477│
├─┼───┼─┼───┼─┼──┼───┤
│44082│72526  │2018-03-05   │2018-03-05 11:56:54│Failed   
│(null)│g942477│
├─┼───┼─┼───┼─┼──┼───┤
│44082│72526  │2018-03-21   │2018-03-21 10:57:40│Not 
Completed│(null)│g942477│


Additionally if this is resolved I need to inject the same logic in below 
summary query to filter out the duplicates:

Currently the real count of Failed, No Run & Not Completed shows higher than 
expected due adding up of duplicates.

MATCH (a:HP)
WHERE  a.project='CHP'  and a.type='test-instance' and a.`cycle-id`='11743'
RETURN  a.status , count(*) as cnt, a.`cycle-id` order by a.status

╒═╤═══╤╕
│a.status │cnt│a.`cycle-id`│
╞═╪═══╪╡
│Failed   │4  │11743   │
├─┼───┼┤
│No Run   │2  │11743   │
├─┼───┼┤
│Not Completed│5  │11743   │
├─┼───┼┤
│Passed   │31 │11743   │
└─┴───┴┘

Any help to resolve this is most appreciated. I need help forming query with 
"Group by" & "having" clause here with example. I also need help creating query 
for sorting by date time. 

Cannot find correct functions for date time addition / subtraction & sorting as 
the field type is "string". Let me know if you need more information. 

I am using version 3.0.6 of Neo4j
   
   - Version:
   3.0.6
   - Name:
   graph.db
   

-- 
You received this message because you are subscribed to the Google Groups 
"Neo4j" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to neo4j+unsubscr...@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.