[jira] [Updated] (HIVE-12664) Bug in reduce deduplication optimization causing ArrayOutOfBoundException

2016-01-04 Thread Johan Gustavsson (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12664?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Johan Gustavsson updated HIVE-12664:

Attachment: HIVE-12664.2.patch

> Bug in reduce deduplication optimization causing ArrayOutOfBoundException
> -
>
> Key: HIVE-12664
> URL: https://issues.apache.org/jira/browse/HIVE-12664
> Project: Hive
> Issue Type: Bug
> Components: Hive
> Affects Versions: 1.1.1, 1.2.1
> Reporter: Johan Gustavsson
> Assignee: Johan Gustavsson
> Attachments: HIVE-12664-1.patch, HIVE-12664-2.patch, 
> HIVE-12664.1.patch, HIVE-12664.2.patch, HIVE-12664.patch
>
>
> The optimisation check for reduce deduplication only checks the first child 
> node for join -and the check itself also contains a major bug- causing 
> ArrayOutOfBoundException no matter what.
> Sample data table form:
> ||time||user||host||path||referer||code||agent||size||method||
> |int|string|string|string|string|bigint|string|bigint|string|
> Sample query
> {code:sql}
> SELECT 
>   t1.host,
>   COUNT(DISTINCT t1.`date`) AS login_count,
>   MAX(t2.code) AS code,
>   unix_timestamp() AS time
> FROM (
>   SELECT
>     HOST,
>     MIN(time) AS DATE
>   FROM
>     www_access
>   WHERE
>     HOST IS NOT NULL
>   GROUP BY
>     HOST
> ) t1
> JOIN (
>   SELECT
>     HOST,
>     MIN(time) AS code
>   FROM
>     www_access
>   WHERE
>     HOST IS NOT NULL
>   GROUP BY
>     HOST
> ) t2
>   ON t1.host = t2.host
> GROUP BY
>   t1.host
> {code}
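
To run the sample query above, a table matching the sample data table form is needed. The following is a minimal sketch and is not taken from the issue: the column names and types come from the table form above, while the backtick quoting and default storage format are assumptions. The `SET` line assumes that the `hive.optimize.reducededuplication` property governs this optimization on the affected versions; running the query with it disabled may help confirm whether the optimization is involved.

```sql
-- Hypothetical DDL built from the sample data table form.
-- Column names are backtick-quoted because some of them (e.g. `user`)
-- may be treated as reserved words, depending on the Hive version.
CREATE TABLE IF NOT EXISTS www_access (
  `time`    INT,
  `user`    STRING,
  `host`    STRING,
  `path`    STRING,
  `referer` STRING,
  `code`    BIGINT,
  `agent`   STRING,
  `size`    BIGINT,
  `method`  STRING
);

-- Assumption: this session setting turns off reduce deduplication,
-- so the sample query can be compared with and without the optimization.
SET hive.optimize.reducededuplication=false;
```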



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-12664) Bug in reduce deduplication optimization causing ArrayOutOfBoundException

2016-01-04 Thread Johan Gustavsson (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12664?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Johan Gustavsson updated HIVE-12664:

Attachment: HIVE-12664-2.patch

> Bug in reduce deduplication optimization causing ArrayOutOfBoundException
> -
>
> Key: HIVE-12664
> URL: https://issues.apache.org/jira/browse/HIVE-12664
> Project: Hive
> Issue Type: Bug
> Components: Hive
> Affects Versions: 1.1.1, 1.2.1
> Reporter: Johan Gustavsson
> Assignee: Johan Gustavsson
> Attachments: HIVE-12664-1.patch, HIVE-12664-2.patch, 
> HIVE-12664.1.patch, HIVE-12664.patch
>
>
> The optimisation check for reduce deduplication only checks the first child 
> node for join -and the check itself also contains a major bug- causing 
> ArrayOutOfBoundException no matter what.
> Sample data table form:
> ||time||user||host||path||referer||code||agent||size||method||
> |int|string|string|string|string|bigint|string|bigint|string|
> Sample query
> {code:sql}
> SELECT 
>   t1.host,
>   COUNT(DISTINCT t1.`date`) AS login_count,
>   MAX(t2.code) AS code,
>   unix_timestamp() AS time
> FROM (
>   SELECT
>     HOST,
>     MIN(time) AS DATE
>   FROM
>     www_access
>   WHERE
>     HOST IS NOT NULL
>   GROUP BY
>     HOST
> ) t1
> JOIN (
>   SELECT
>     HOST,
>     MIN(time) AS code
>   FROM
>     www_access
>   WHERE
>     HOST IS NOT NULL
>   GROUP BY
>     HOST
> ) t2
>   ON t1.host = t2.host
> GROUP BY
>   t1.host
> {code}





[jira] [Updated] (HIVE-12664) Bug in reduce deduplication optimization causing ArrayOutOfBoundException

2015-12-21 Thread Johan Gustavsson (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12664?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Johan Gustavsson updated HIVE-12664:

Description: 
The optimisation check for reduce deduplication only checks the first child 
node for join -and the check itself also contains a major bug- causing 
ArrayOutOfBoundException no matter what.

Sample data table form:
||time||user||host||path||referer||code||agent||size||method||
|int|string|string|string|string|bigint|string|bigint|string|

Sample query
{code:sql}
SELECT 
  t1.host,
  COUNT(DISTINCT t1.`date`) AS login_count,
  MAX(t2.code) AS code,
  unix_timestamp() AS time
FROM (
  SELECT
    HOST,
    MIN(time) AS DATE
  FROM
    www_access
  WHERE
    HOST IS NOT NULL
  GROUP BY
    HOST
) t1
JOIN (
  SELECT
    HOST,
    MIN(time) AS code
  FROM
    www_access
  WHERE
    HOST IS NOT NULL
  GROUP BY
    HOST
) t2
  ON t1.host = t2.host
GROUP BY
  t1.host
{code}

  was:
The optimisation check for reduce deduplication only checks the first child 
node for join -and the check itself also contains a major bug- causing 
ArrayOutOfBoundException no matter what.

Sample data table form:
time||user||host||path||referer||code||agent||size||method
int|string|string|string|string|bigint|string|bigint|string

Sample query
{code:sql}
SELECT 
  t1.host,
  COUNT(DISTINCT t1.`date`) AS login_count,
  MAX(t2.code) AS code,
  unix_timestamp() AS time
FROM (
  SELECT
    HOST,
    MIN(time) AS DATE
  FROM
    www_access
  WHERE
    HOST IS NOT NULL
  GROUP BY
    HOST
) t1
JOIN (
  SELECT
    HOST,
    MIN(time) AS code
  FROM
    www_access
  WHERE
    HOST IS NOT NULL
  GROUP BY
    HOST
) t2
  ON t1.host = t2.host
GROUP BY
  t1.host
{code}


> Bug in reduce deduplication optimization causing ArrayOutOfBoundException
> -
>
> Key: HIVE-12664
> URL: https://issues.apache.org/jira/browse/HIVE-12664
> Project: Hive
> Issue Type: Bug
> Components: Hive
> Affects Versions: 1.1.1, 1.2.1
> Reporter: Johan Gustavsson
> Assignee: Johan Gustavsson
> Attachments: HIVE-12664-1.patch, HIVE-12664.1.patch, HIVE-12664.patch
>
>
> The optimisation check for reduce deduplication only checks the first child 
> node for join -and the check itself also contains a major bug- causing 
> ArrayOutOfBoundException no matter what.
> Sample data table form:
> ||time||user||host||path||referer||code||agent||size||method||
> |int|string|string|string|string|bigint|string|bigint|string|
> Sample query
> {code:sql}
> SELECT 
>   t1.host,
>   COUNT(DISTINCT t1.`date`) AS login_count,
>   MAX(t2.code) AS code,
>   unix_timestamp() AS time
> FROM (
>   SELECT
>     HOST,
>     MIN(time) AS DATE
>   FROM
>     www_access
>   WHERE
>     HOST IS NOT NULL
>   GROUP BY
>     HOST
> ) t1
> JOIN (
>   SELECT
>     HOST,
>     MIN(time) AS code
>   FROM
>     www_access
>   WHERE
>     HOST IS NOT NULL
>   GROUP BY
>     HOST
> ) t2
>   ON t1.host = t2.host
> GROUP BY
>   t1.host
> {code}





[jira] [Updated] (HIVE-12664) Bug in reduce deduplication optimization causing ArrayOutOfBoundException

2015-12-21 Thread Johan Gustavsson (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12664?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Johan Gustavsson updated HIVE-12664:

Description: 
The optimisation check for reduce deduplication only checks the first child 
node for join -and the check itself also contains a major bug- causing 
ArrayOutOfBoundException no matter what.

Sample data table form:
time||user||host||path||referer||code||agent||size||method
int|string|string|string|string|bigint|string|bigint|string

Sample query
{code:sql}
SELECT 
  t1.host,
  COUNT(DISTINCT t1.`date`) AS login_count,
  MAX(t2.code) AS code,
  unix_timestamp() AS time
FROM (
  SELECT
    HOST,
    MIN(time) AS DATE
  FROM
    www_access
  WHERE
    HOST IS NOT NULL
  GROUP BY
    HOST
) t1
JOIN (
  SELECT
    HOST,
    MIN(time) AS code
  FROM
    www_access
  WHERE
    HOST IS NOT NULL
  GROUP BY
    HOST
) t2
  ON t1.host = t2.host
GROUP BY
  t1.host
{code}

  was:The optimisation check for reduce deduplication only checks the first 
child node for join -and the check itself also contains a major bug- causing 
ArrayOutOfBoundException no matter what.


> Bug in reduce deduplication optimization causing ArrayOutOfBoundException
> -
>
> Key: HIVE-12664
> URL: https://issues.apache.org/jira/browse/HIVE-12664
> Project: Hive
> Issue Type: Bug
> Components: Hive
> Affects Versions: 1.1.1, 1.2.1
> Reporter: Johan Gustavsson
> Assignee: Johan Gustavsson
> Attachments: HIVE-12664-1.patch, HIVE-12664.1.patch, HIVE-12664.patch
>
>
> The optimisation check for reduce deduplication only checks the first child 
> node for join -and the check itself also contains a major bug- causing 
> ArrayOutOfBoundException no matter what.
> Sample data table form:
> time||user||host||path||referer||code||agent||size||method
> int|string|string|string|string|bigint|string|bigint|string
> Sample query
> {code:sql}
> SELECT 
>   t1.host,
>   COUNT(DISTINCT t1.`date`) AS login_count,
>   MAX(t2.code) AS code,
>   unix_timestamp() AS time
> FROM (
>   SELECT
>     HOST,
>     MIN(time) AS DATE
>   FROM
>     www_access
>   WHERE
>     HOST IS NOT NULL
>   GROUP BY
>     HOST
> ) t1
> JOIN (
>   SELECT
>     HOST,
>     MIN(time) AS code
>   FROM
>     www_access
>   WHERE
>     HOST IS NOT NULL
>   GROUP BY
>     HOST
> ) t2
>   ON t1.host = t2.host
> GROUP BY
>   t1.host
> {code}





[jira] [Updated] (HIVE-12664) Bug in reduce deduplication optimization causing ArrayOutOfBoundException

2015-12-21 Thread Johan Gustavsson (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12664?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Johan Gustavsson updated HIVE-12664:

Attachment: HIVE-12664.1.patch

> Bug in reduce deduplication optimization causing ArrayOutOfBoundException
> -
>
> Key: HIVE-12664
> URL: https://issues.apache.org/jira/browse/HIVE-12664
> Project: Hive
> Issue Type: Bug
> Components: Hive
> Affects Versions: 1.1.1, 1.2.1
> Reporter: Johan Gustavsson
> Assignee: Johan Gustavsson
> Attachments: HIVE-12664-1.patch, HIVE-12664.1.patch, HIVE-12664.patch
>
>
> The optimisation check for reduce deduplication only checks the first child 
> node for join -and the check itself also contains a major bug- causing 
> ArrayOutOfBoundException no matter what.





[jira] [Updated] (HIVE-12664) Bug in reduce deduplication optimization causing ArrayOutOfBoundException

2015-12-14 Thread Johan Gustavsson (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12664?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Johan Gustavsson updated HIVE-12664:

Attachment: HIVE-12664-1.patch

Original patch wasn't against trunk... sorry about that.


> Bug in reduce deduplication optimization causing ArrayOutOfBoundException
> -
>
> Key: HIVE-12664
> URL: https://issues.apache.org/jira/browse/HIVE-12664
> Project: Hive
> Issue Type: Bug
> Components: Hive
> Affects Versions: 1.1.1, 1.2.1
> Reporter: Johan Gustavsson
> Assignee: Johan Gustavsson
> Attachments: HIVE-12664-1.patch, HIVE-12664.patch
>
>
> The optimisation check for reduce deduplication only checks the first child 
> node for join -and the check itself also contains a major bug- causing 
> ArrayOutOfBoundException no matter what.





[jira] [Updated] (HIVE-12664) Bug in reduce deduplication optimization causing ArrayOutOfBoundException

2015-12-14 Thread Johan Gustavsson (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12664?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Johan Gustavsson updated HIVE-12664:

Description: The optimisation check for reduce deduplication only checks 
the first child node for join -and the check itself also contains a major bug- 
causing ArrayOutOfBoundException no matter what.  (was: The optimisation check 
for reduce deduplication only checks the first child node for join and the 
check itself also contains a major bug causing ArrayOutOfBoundException no 
matter what.)

> Bug in reduce deduplication optimization causing ArrayOutOfBoundException
> -
>
> Key: HIVE-12664
> URL: https://issues.apache.org/jira/browse/HIVE-12664
> Project: Hive
> Issue Type: Bug
> Components: Hive
> Affects Versions: 1.1.1, 1.2.1
> Reporter: Johan Gustavsson
> Assignee: Johan Gustavsson
> Attachments: HIVE-12664.patch
>
>
> The optimisation check for reduce deduplication only checks the first child 
> node for join -and the check itself also contains a major bug- causing 
> ArrayOutOfBoundException no matter what.





[jira] [Updated] (HIVE-12664) Bug in reduce deduplication optimization causing ArrayOutOfBoundException

2015-12-13 Thread Johan Gustavsson (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12664?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Johan Gustavsson updated HIVE-12664:

Attachment: HIVE-12664.patch

> Bug in reduce deduplication optimization causing ArrayOutOfBoundException
> -
>
> Key: HIVE-12664
> URL: https://issues.apache.org/jira/browse/HIVE-12664
> Project: Hive
> Issue Type: Bug
> Components: Hive
> Affects Versions: 1.1.1, 1.2.1
> Reporter: Johan Gustavsson
> Assignee: Johan Gustavsson
> Attachments: HIVE-12664.patch
>
>
> The optimisation check for reduce deduplication only checks the first child 
> node for join and the check itself also contains a major bug causing 
> ArrayOutOfBoundException no matter what.


