Hi

I can see that the exception is caused by the following; can you check where
in your code you are using this path?

Caused by: org.apache.hadoop.mapred.InvalidInputException: Input path does
not exist:
hdfs://testcluster:8020/experiments/vol/spark_chomp_data/bak/restaurants-bak/latest
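A quick way to confirm that is to test for the path before renaming. This is a
minimal sketch only; `renameIfExists` is a hypothetical helper, and the
`FileSystem`/`Path` usage follows the `saveAsLatest` snippet quoted below:

```scala
import org.apache.hadoop.fs.{FileSystem, Path}

// Sketch: fail fast with a clear message if the source directory is missing,
// instead of letting the job die later with InvalidInputException.
def renameIfExists(fileSystem: FileSystem, src: Path, dest: Path): Boolean =
  if (fileSystem.exists(src)) {
    fileSystem.rename(src, dest)
  } else {
    println(s"Input path does not exist: $src")
    false
  }
```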

On Wed, 17 Aug 2016, 10:57 p.m. max square, <max2subscr...@gmail.com> wrote:

> /bump
>
> It'd be great if someone could point me in the right direction.
>
> On Mon, Aug 8, 2016 at 5:07 PM, max square <max2subscr...@gmail.com>
> wrote:
>
>> Here's the complete stacktrace -
>> https://gist.github.com/rohann/649b0fcc9d5062ef792eddebf5a315c1
>>
>> For reference, here's the complete function -
>>
>>   def saveAsLatest(df: DataFrame, fileSystem: FileSystem, bakDir: String) = {
>>     fileSystem.rename(new Path(bakDir + latest),
>>       new Path(bakDir + "/" + ScalaUtil.currentDateTimeString))
>>     df.write.format("com.databricks.spark.csv")
>>       .options(Map("mode" -> "DROPMALFORMED", "delimiter" -> "\t", "header" -> "true"))
>>       .save(bakDir + latest)
>>   }
>>
>> On Mon, Aug 8, 2016 at 3:41 PM, Ted Yu <yuzhih...@gmail.com> wrote:
>>
>>> Mind showing the complete stack trace ?
>>>
>>> Thanks
>>>
>>> On Mon, Aug 8, 2016 at 12:30 PM, max square <max2subscr...@gmail.com>
>>> wrote:
>>>
>>>> Thanks Ted for the prompt reply.
>>>>
>>>> There are three or four DFs that are coming from various sources and
>>>> I'm doing a unionAll on them.
>>>>
>>>> val placesProcessed = placesUnchanged
>>>>   .unionAll(placesAddedWithMerchantId)
>>>>   .unionAll(placesUpdatedFromHotelsWithMerchantId)
>>>>   .unionAll(placesUpdatedFromRestaurantsWithMerchantId)
>>>>   .unionAll(placesChanged)
>>>>
>>>> I'm using Spark 1.6.2.
>>>>
>>>> On Mon, Aug 8, 2016 at 3:11 PM, Ted Yu <yuzhih...@gmail.com> wrote:
>>>>
>>>>> Can you show the code snippet for unionAll operation ?
>>>>>
>>>>> Which Spark release do you use ?
>>>>>
>>>>> BTW please use user@spark.apache.org in the future.
>>>>>
>>>>> On Mon, Aug 8, 2016 at 11:47 AM, max square <max2subscr...@gmail.com>
>>>>> wrote:
>>>>>
>>>>>> Hey guys,
>>>>>>
>>>>>> I'm trying to save a DataFrame in CSV format after performing unionAll
>>>>>> operations on it, but I get this exception:
>>>>>>
>>>>>> Exception in thread "main"
>>>>>> org.apache.spark.sql.catalyst.errors.package$TreeNodeException: execute,
>>>>>> tree:
>>>>>> TungstenExchange hashpartitioning(mId#430,200)
>>>>>>
>>>>>> I'm saving it by
>>>>>>
>>>>>> df.write.format("com.databricks.spark.csv")
>>>>>>   .options(Map("mode" -> "DROPMALFORMED", "delimiter" -> "\t", "header" -> "true"))
>>>>>>   .save(bakDir + latest)
>>>>>>
>>>>>> It works perfectly if I don't do the unionAll operation.
>>>>>> The schema doesn't look any different when I print part of the results.
>>>>>>
>>>>>> Any help regarding this would be appreciated.
>>>>>>
>>>>>>
>>>>>
>>>>
>>>
>>
>
