[jira] [Updated] (PIG-2614) AvroStorage crashes on LOADING a single bad error

2012-12-25 Thread Cheolsoo Park (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-2614?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Cheolsoo Park updated PIG-2614:
---

Resolution: Duplicate
Status: Resolved  (was: Patch Available)

Closing this as a duplicate, since PIG-3059 incorporates it.

> AvroStorage crashes on LOADING a single bad error
> -
>
> Key: PIG-2614
> URL: https://issues.apache.org/jira/browse/PIG-2614
> Project: Pig
> Issue Type: Bug
> Components: piggybank
> Affects Versions: 0.10.0, 0.11
> Reporter: Russell Jurney
> Assignee: Jonathan Coveney
> Labels: avro, avrostorage, bad, book, cutting, doug, for, my, pig, sadism
> Fix For: 0.12
>
> Attachments: PIG-2614_0.patch, PIG-2614_1.patch, PIG-2614_2.patch, test_avro_files.tar.gz
>
>
> AvroStorage dies when a single bad record exists, such as one with missing
> fields. This is very bad on 'big data,' where bad records are inevitable.
> See discussion at
> http://www.quora.com/Big-Data/In-Big-Data-ETL-how-many-records-are-an-acceptable-loss
> for more theory.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (PIG-2614) AvroStorage crashes on LOADING a single bad error

2012-12-18 Thread Julien Le Dem (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-2614?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Julien Le Dem updated PIG-2614:
---

Fix Version/s: (was: 0.10.1)
   (was: 0.11)
   0.12

Moving this to the next release so that we can converge on Pig 0.11.

> AvroStorage crashes on LOADING a single bad error
> -
>
> Key: PIG-2614
> URL: https://issues.apache.org/jira/browse/PIG-2614
> Project: Pig
> Issue Type: Bug
> Components: piggybank
> Affects Versions: 0.10.0, 0.11
> Reporter: Russell Jurney
> Assignee: Jonathan Coveney
> Labels: avro, avrostorage, bad, book, cutting, doug, for, my, pig, sadism
> Fix For: 0.12
>
> Attachments: PIG-2614_0.patch, PIG-2614_1.patch, PIG-2614_2.patch, test_avro_files.tar.gz



[jira] [Updated] (PIG-2614) AvroStorage crashes on LOADING a single bad error

2012-11-30 Thread Cheolsoo Park (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-2614?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Cheolsoo Park updated PIG-2614:
---

Status: Patch Available  (was: Open)

> AvroStorage crashes on LOADING a single bad error
> -
>
> Key: PIG-2614
> URL: https://issues.apache.org/jira/browse/PIG-2614
> Project: Pig
> Issue Type: Bug
> Components: piggybank
> Affects Versions: 0.10.0, 0.11
> Reporter: Russell Jurney
> Assignee: Jonathan Coveney
> Labels: avro, avrostorage, bad, book, cutting, doug, for, my, pig, sadism
> Fix For: 0.11, 0.10.1
>
> Attachments: PIG-2614_0.patch, PIG-2614_1.patch, PIG-2614_2.patch, test_avro_files.tar.gz



[jira] [Updated] (PIG-2614) AvroStorage crashes on LOADING a single bad error

2012-11-30 Thread Cheolsoo Park (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-2614?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Cheolsoo Park updated PIG-2614:
---

Attachment: test_avro_files.tar.gz
PIG-2614_2.patch

Hi all,

I rebased the patch to trunk. Hopefully, this makes things clearer:
- Removed the PIG-2551 code, since it's already committed to trunk.
- Replaced the {{ignore_bad_file}} option that was committed in PIG-2909 with 
the {{bad.record.threshold}} and {{bad.record.min}} properties.
- Added unit test cases {{testCorruptedFile1,2,3}}.
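
As a rough illustration of how the two properties might be consumed, the sketch below reads them from a plain java.util.Properties object. Only the property names come from the list above. The BadRecordSettings class, the defaults of zero, and the reading of bad.record.threshold as a fraction between 0 and 1 are assumptions, and the actual patch may read its settings differently (for example, from the Hadoop job configuration).

```java
import java.util.Properties;

/** Hypothetical holder for the two settings; not a class from the patch. */
final class BadRecordSettings {
    final double threshold; // assumed: largest tolerated fraction of bad records
    final long min;         // assumed: bad records tolerated before the fraction applies

    private BadRecordSettings(double threshold, long min) {
        this.threshold = threshold;
        this.min = min;
    }

    /** Reads both properties, falling back to assumed defaults of zero (no tolerance). */
    static BadRecordSettings from(Properties props) {
        double threshold = Double.parseDouble(props.getProperty("bad.record.threshold", "0"));
        long min = Long.parseLong(props.getProperty("bad.record.min", "0"));
        // The negated range check also rejects NaN.
        if (!(threshold >= 0.0 && threshold <= 1.0)) {
            throw new IllegalArgumentException("bad.record.threshold must be in [0, 1]: " + threshold);
        }
        if (min < 0) {
            throw new IllegalArgumentException("bad.record.min must not be negative: " + min);
        }
        return new BadRecordSettings(threshold, min);
    }
}
```

Validating the values once at load time keeps a misconfigured job from failing only after it has already processed part of its input.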

@Joe,
I am not sure I fully understand your question. Please correct me if I am 
wrong.

You're right that {{InputErrorTracker}} can be used by any LoadFunc. All a 
storage needs to do is create an {{InputErrorTracker}} and increment its 
counters. Do you have a better suggestion?
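
A minimal sketch of that pattern follows: a loader counts every record it attempts, counts each failure, and lets the tracker decide when to give up. This is not the code from the attached patch. The BadRecordTracker class, its method names, and the rule it applies (fail once at least minErrors bad records have been seen and the error rate exceeds threshold) are assumptions made for illustration, and integer parsing stands in for Avro decoding.

```java
import java.util.Arrays;
import java.util.List;

/** Hypothetical stand-in for an error tracker; not the patch's InputErrorTracker. */
final class BadRecordTracker {
    private final double threshold; // assumed: largest tolerated fraction of bad records
    private final long minErrors;   // assumed: errors needed before the rate is enforced
    private long records;
    private long errors;

    BadRecordTracker(double threshold, long minErrors) {
        this.threshold = threshold;
        this.minErrors = minErrors;
    }

    /** Count every record the loader attempts to read. */
    void incRecords() {
        records++;
    }

    /** Count a bad record; rethrow once errors are both numerous and frequent. */
    void incErrors(Exception cause) {
        errors++;
        double rate = (double) errors / Math.max(records, 1);
        if (errors >= minErrors && rate > threshold) {
            throw new RuntimeException(
                "Too many bad records: " + errors + " of " + records, cause);
        }
    }
}

final class TolerantLoaderDemo {
    /** Sums the lines that parse as integers, skipping the ones that do not. */
    static int sumSkippingBad(List<String> lines, BadRecordTracker tracker) {
        int sum = 0;
        for (String line : lines) {
            tracker.incRecords();
            try {
                sum += Integer.parseInt(line.trim());
            } catch (NumberFormatException e) {
                tracker.incErrors(e); // skips the record unless the limits are exceeded
            }
        }
        return sum;
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList("1", "2", "oops", "4");
        System.out.println(sumSkippingBad(lines, new BadRecordTracker(0.5, 2)));
    }
}
```

Keeping the decision inside the tracker means the loader itself only has to report what it saw, which is what would let any LoadFunc reuse the same policy.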

Thanks!

> AvroStorage crashes on LOADING a single bad error
> -
>
> Key: PIG-2614
> URL: https://issues.apache.org/jira/browse/PIG-2614
> Project: Pig
> Issue Type: Bug
> Components: piggybank
> Affects Versions: 0.10.0, 0.11
> Reporter: Russell Jurney
> Assignee: Jonathan Coveney
> Labels: avro, avrostorage, bad, book, cutting, doug, for, my, pig, sadism
> Fix For: 0.11, 0.10.1
>
> Attachments: PIG-2614_0.patch, PIG-2614_1.patch, PIG-2614_2.patch, test_avro_files.tar.gz



[jira] [Updated] (PIG-2614) AvroStorage crashes on LOADING a single bad error

2012-04-26 Thread Daniel Dai (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-2614?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Dai updated PIG-2614:


Fix Version/s: (was: 0.10.0)
   0.10.1

> AvroStorage crashes on LOADING a single bad error
> -
>
> Key: PIG-2614
> URL: https://issues.apache.org/jira/browse/PIG-2614
> Project: Pig
> Issue Type: Bug
> Components: piggybank
> Affects Versions: 0.10.0, 0.11
> Reporter: Russell Jurney
> Labels: avro, avrostorage, bad, book, cutting, doug, for, my, pig, sadism
> Fix For: 0.11, 0.10.1
>
> Attachments: PIG-2614_0.patch, PIG-2614_1.patch

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (PIG-2614) AvroStorage crashes on LOADING a single bad error

2012-03-27 Thread Jonathan Coveney (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-2614?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Coveney updated PIG-2614:
--

Attachment: PIG-2614_1.patch

Russell,

This patch gets rid of the logging dependency. Let me know if you have any 
issues with it!

> AvroStorage crashes on LOADING a single bad error
> -
>
> Key: PIG-2614
> URL: https://issues.apache.org/jira/browse/PIG-2614
> Project: Pig
> Issue Type: Bug
> Components: piggybank
> Affects Versions: 0.10, 0.11
> Reporter: Russell Jurney
> Labels: avro, avrostorage, bad, book, cutting, doug, for, my, pig, sadism
> Fix For: 0.10, 0.11
>
> Attachments: PIG-2614_0.patch, PIG-2614_1.patch





[jira] [Updated] (PIG-2614) AvroStorage crashes on LOADING a single bad error

2012-03-26 Thread Jonathan Coveney (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-2614?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Coveney updated PIG-2614:
--

Attachment: PIG-2614_0.patch

Works in both 0.10 and 0.11.

> AvroStorage crashes on LOADING a single bad error
> -
>
> Key: PIG-2614
> URL: https://issues.apache.org/jira/browse/PIG-2614
> Project: Pig
> Issue Type: Bug
> Components: piggybank
> Affects Versions: 0.10, 0.11
> Reporter: Russell Jurney
> Priority: Blocker
> Labels: avro, avrostorage, bad, book, cutting, doug, for, my, pig, sadism
> Fix For: 0.10, 0.11
>
> Attachments: PIG-2614_0.patch





[jira] [Updated] (PIG-2614) AvroStorage crashes on LOADING a single bad error

2012-03-26 Thread Jonathan Coveney (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-2614?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Coveney updated PIG-2614:
--

 Priority: Major  (was: Blocker)
Affects Version/s: 0.11
Fix Version/s: 0.11

> AvroStorage crashes on LOADING a single bad error
> -
>
> Key: PIG-2614
> URL: https://issues.apache.org/jira/browse/PIG-2614
> Project: Pig
> Issue Type: Bug
> Components: piggybank
> Affects Versions: 0.10, 0.11
> Reporter: Russell Jurney
> Labels: avro, avrostorage, bad, book, cutting, doug, for, my, pig, sadism
> Fix For: 0.10, 0.11
>
> Attachments: PIG-2614_0.patch
