vijayalakshmi karasani created PIG-4617:
-------------------------------------------

             Summary: XML loader is not working fine with pig 0.14 version
                 Key: PIG-4617
                 URL: https://issues.apache.org/jira/browse/PIG-4617
             Project: Pig
          Issue Type: Bug
          Components: piggybank, UI
            Reporter: vijayalakshmi karasani
            Priority: Blocker


My old Pig script, which loads and parses XML files and ran successfully on Pig 0.13, does not run on Pig 0.14 and throws java.lang.IndexOutOfBoundsException: start 4, end 2, s.length() 2.

Of my 10 XML files, 2 run fine and the other 8 fail. All of these XML files ran successfully with Pig 0.13. Maybe the new version added more validation of XML well-formedness.

My Code:

REGISTER '/usr/hdp/current/pig-client/lib/piggybank.jar';
C = LOAD '/common/data/dia/stepxml/*' using org.apache.pig.piggybank.storage.XMLLoader('Product') as (x:chararray);
STORE C into '/common/data/dia/intermediate_xmls/Imn_Unique_both2';

ERROR:

Caused by: org.apache.hadoop.mapreduce.lib.input.InvalidInputException: Input Pattern hdfs://d-3zkyk02.target.com:8020/common/data/dia/stepxml/* matches 0 files
Caused by: org.apache.hadoop.mapreduce.lib.input.InvalidInputException: Input Pattern hdfs://d-3zkyk02.target.com:8020/common/data/dia/stepxml/* matches 0 files
    at org.apache.hadoop.mapreduce.lib.input.FileInputFormat.singleThreadedListStatus(FileInputFormat.java:321)
    at org.apache.hadoop.mapreduce.lib.input.FileInputFormat.listStatus(FileInputFormat.java:264)
    at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigTextInputFormat.listStatus(PigTextInputFormat.java:36)
    at org.apache.hadoop.mapreduce.lib.input.FileInputFormat.getSplits(FileInputFormat.java:385)
    at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigInputFormat.getSplits(PigInputFormat.java:265)
    ... 18 more

/common/data/dia/intermediate_xmls/Imn_Unique_both2,

Input(s):
Failed to read data from "/common/data/dia/stepxml/*"

Output(s):
Failed to produce result in "/common/data/dia/intermediate_xmls/Imn_Unique_both2"



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)