> …https://ci.apache.org/projects/flink/flink-docs-master/internals/monitoring_rest_api.html#details-of-a-running-or-completed-job
>
> However, not all of this data is going over the network because some tasks
> can be locally connected.
>
> Best, Fabian
>
> 2016-01-29 8:50 GMT+01:00 Philip Lee :
>
>> Thanks,
>>
> …it is not possible to pass this data back into Flink's
> dashboard, but you have to process and plot it yourself.
>
> Best, Fabian
>
> [1]
> https://ci.apache.org/projects/flink/flink-docs-master/internals/monitoring_rest_api.html#overview-of-jobs
>
>
>
> 2016-01-25 …
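For illustration, a minimal sketch of querying the monitoring REST API that Fabian points to in [1]. The endpoint path and the default dashboard port 8081 are assumptions based on the linked docs:

import scala.io.Source

// Fetch the job overview from the JobManager's monitoring REST API.
// Assumes the default web dashboard port 8081 on localhost.
val overview = Source.fromURL("http://localhost:8081/joboverview").mkString
println(overview)

The returned JSON can then be processed and plotted with any external tool, as Fabian suggests.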
Hello,
A question about reading the ORC format in Flink.
I want to use a dataset that I converted from CSV to ORC format with Hive after load testing.
Does Flink support reading the ORC format?
If so, please let me know how to use such a dataset in Flink.
Best,
Phil
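One possible approach, sketched under the assumption that Flink's Hadoop compatibility layer is used together with Hive's mapred OrcInputFormat (hive-exec on the classpath); the path is hypothetical:

import org.apache.flink.api.scala._
import org.apache.hadoop.hive.ql.io.orc.{OrcInputFormat, OrcStruct}
import org.apache.hadoop.io.NullWritable

val env = ExecutionEnvironment.getExecutionEnvironment
// Wrap Hive's mapred OrcInputFormat; keys are NullWritable, values OrcStruct.
val orcData = env.readHadoopFile(
  new OrcInputFormat,
  classOf[NullWritable],
  classOf[OrcStruct],
  "hdfs:///path/to/orc/table") // hypothetical path
// Extract plain values from the OrcStruct before further processing.
orcData.map(_._2.toString).first(10).print()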
Hello,
According to
http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/Apache-Flink-Web-Dashboard-Completed-Job-history-td4067.html,
I cannot retrieve the job history from the Dashboard after turning off the JobManager.
But as Fabian mentioned there,
"However, you can query all stats that are displayed…
Oops, sorry,
I meant to send this one to the Hive mailing list.
On Fri, Dec 18, 2015 at 2:19 AM, Philip Lee wrote:
> I think it is a Hive bug related to the metastore.
>
> Here is the thing.
>
> After I generated scale factor 300 named bigbench300 and bigbench100…
I think it is a Hive bug related to the metastore.
Here is the thing.
After I generated scale factor 300, named bigbench300, alongside bigbench100,
which already existed,
I ran the Hive job with bigbench300. At first it was really fine.
Then I ran the Hive job with bigbench100 again. It w…
> …Apache Ambari [2],
> you can fetch metrics easily from the pre-installed Ganglia.
>
> [1]: http://ganglia.sourceforge.net
> [2]: https://ambari.apache.org
>
> > On Dec 8, 2015, at 4:54 AM, Philip Lee wrote:
> >
> > Hello, a question about metrics.
> >
> > I want…
Hello, a question about metrics.
I want to evaluate some queries on Spark, Flink, and Hive for a comparison.
I am using 'vmstat' to check metrics such as the amount of memory used,
swap, io, and cpu. Is my way of evaluating right? I ask because they all
use the JVM's resources for memory and cpu.
Is there any Linux app…
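As a complement to machine-wide tools like vmstat, a small sketch of reading JVM-level memory usage from inside the job itself, via the standard management beans:

import java.lang.management.ManagementFactory

// vmstat reports machine-wide numbers; the MemoryMXBean isolates the JVM heap.
val heap = ManagementFactory.getMemoryMXBean.getHeapMemoryUsage
println(s"heap used: ${heap.getUsed >> 20} MB, max: ${heap.getMax >> 20} MB")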
Hello, a question about the performance of the apply function after a join.
Just for your information, I am running the Flink job on a cluster consisting
of 9 machines with 48 cores each. I am working on a benchmark comparing
Flink, Spark SQL, and Hive.
I tried to optimize the *join function with a Hint* for better p…
I want to join two tables on two columns, like
//AND sr_customer_sk = ws_bill_customer_sk
//AND sr_item_sk = ws_item_sk
val srJoinWs = storeReturn.join(webSales).where(_._item_sk).equalTo(_._item_sk) {
  (storeReturn: StoreReturn, webSales: WebSales, out: Collector[(Long, L…
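A hedged sketch of how the two-column join could look, using tuple key selectors so both columns form the key; the record types, field names, and sample data are assumptions standing in for the benchmark schema:

import org.apache.flink.api.scala._
import org.apache.flink.util.Collector

// Hypothetical minimal record types standing in for the benchmark tables.
case class StoreReturn(_customer_sk: Long, _item_sk: Long)
case class WebSales(_bill_customer_sk: Long, _item_sk: Long)

val env = ExecutionEnvironment.getExecutionEnvironment
val storeReturn = env.fromElements(StoreReturn(1L, 10L), StoreReturn(2L, 20L))
val webSales = env.fromElements(WebSales(1L, 10L), WebSales(3L, 30L))

val srJoinWs = storeReturn.join(webSales)
  .where(sr => (sr._customer_sk, sr._item_sk))        // composite key, left side
  .equalTo(ws => (ws._bill_customer_sk, ws._item_sk)) // composite key, right side
  .apply { (sr: StoreReturn, ws: WebSales, out: Collector[(Long, Long)]) =>
    out.collect((sr._customer_sk, sr._item_sk))
  }

If the optimizer picks a poor strategy, the 0.10-era API also accepts a JoinHint as a second argument, e.g. storeReturn.join(webSales, JoinHint.BROADCAST_HASH_FIRST), assuming one input is small enough to replicate.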
Hi Philip,
>
> thanks for reporting the issue. I just verified the problem.
> It is working correctly for the Java API, but is broken in Scala.
>
> I will work on a fix and include it in the next RC for 0.10.0.
>
> Thanks, Fabian
>
> 2015-11-02 12:58 GMT+01:00 Philip Lee :
> …the same as in SQL when you state "ORDER BY col1, col2".
>
> The SortPartitionOperator created with the first "sortPartition(col1)"
> call appends further columns, rather than instantiating a new sort.
>
> Greetings,
> Stephan
>
>
> On Sun, Nov …
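In code, a minimal sketch of what Stephan describes, with made-up data:

import org.apache.flink.api.common.operators.Order
import org.apache.flink.api.scala._

val env = ExecutionEnvironment.getExecutionEnvironment
val ds = env.fromElements((3, "b"), (1, "a"), (3, "a"))
// Equivalent to SQL "ORDER BY _1, _2": the second sortPartition call appends
// a secondary sort key rather than starting a new sort.
val ordered = ds
  .sortPartition(0, Order.ASCENDING)
  .sortPartition(1, Order.ASCENDING)
  .setParallelism(1) // a single partition makes the partition sort a total order
ordered.print()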
Hi,
I know that when applying ORDER BY col, it would be
sortPartition(col).setParallelism(1)
What about ordering by two or more columns?
If the SQL states ORDER BY col_1, col_2, sortPartition().sortPartition()
does not seem to solve this SQL,
because ORDER BY in SQL sorts by the first column and then by the second c…
>>> …you can read the
>>> CSV into a DataSet and treat the empty string as a null value. Not very
>>> nice but a workaround. As of now, Flink deliberately doesn't support null
>>> values.
>>>
>>> Regards,
>>> Max
>>>
>>>
>>
Plus, following Shiti's suggestion, could we use the RowSerializer to overcome
this null value problem?
I tried it in many ways, but it still did not work.
Could you give an example of it, following the previous email?
On Sat, Oct 24, 2015 at 11:19 PM, Philip Lee wrote:
> Maximilian said if we handle null va…
>> …deliberately doesn't support null
>> values.
>>
>> Regards,
>> Max
>>
>>
>> On Thu, Oct 22, 2015 at 4:30 PM, Philip Lee wrote:
>>
>>> Hi,
>>>
>>> I am trying to load a dataset that contains null values by using
>>> readCsvFile().
Hi,
I am trying to load a dataset that contains null values by using
readCsvFile().
// e.g. _date|_click|_sales|_item|_web_page|_user
case class WebClick(_click_date: Long, _click_time: Long, _sales: Int,
  _item: Int, _page: Int, _user: Int)
private def getWebClickDataSet(env: ExecutionEnvironment…
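A hedged sketch of the workaround Max describes: read the nullable column as a String and map empty fields to a sentinel yourself. The path is hypothetical, and the named parameters follow the Scala readCsvFile signature:

import org.apache.flink.api.scala._

val env = ExecutionEnvironment.getExecutionEnvironment
// Read the nullable _sales column as String, since case classes cannot hold nulls.
val raw = env.readCsvFile[(Long, Long, String, Int, Int, Int)](
  "hdfs:///path/to/web_clicks.dat", // hypothetical path
  fieldDelimiter = "|",
  lenient = true) // skip malformed lines instead of failing the job
// Map the empty string (the "null" marker) to a sentinel value.
val clicks: DataSet[WebClick] = raw.map { t =>
  WebClick(t._1, t._2, if (t._3.isEmpty) -1 else t._3.toInt, t._4, t._5, t._6)
}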
> …group on a different field than the grouping field.
>
> It is not possible to call partitionByHash().sortGroup() because
> sortGroup() requires groups, which are created by groupBy().
>
> Best, Fabian
>
> 2015-10-19 14:31 GMT+02:00 Philip Lee :
>
>> Thanks, Fabian.
>>
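In code, the supported combination Fabian describes looks like this (a minimal sketch with made-up data):

import org.apache.flink.api.common.operators.Order
import org.apache.flink.api.scala._

val env = ExecutionEnvironment.getExecutionEnvironment
val sales = env.fromElements((1, 30.0), (1, 10.0), (2, 20.0))
// sortGroup needs groups, so it must follow groupBy, not partitionByHash.
val topPerGroup = sales
  .groupBy(0)                     // group on field 0
  .sortGroup(1, Order.DESCENDING) // sort each group on a different field
  .first(1)                       // e.g. take the top record per group
topPerGroup.print()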
>> …reduces other than in the value itself.
>> The GroupReduce, on the other hand, may produce none, one, or multiple
>> elements per grouping and keep state in between emitting values. Thus,
>> GroupReduce is a more powerful operator and can be seen as a superset
>> of the Reduce function.
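To make the contrast concrete, a minimal sketch with made-up data:

import org.apache.flink.api.scala._

val env = ExecutionEnvironment.getExecutionEnvironment
val words = env.fromElements(("a", 1), ("a", 2), ("b", 3))

// Reduce: pairwise, input and output have the same type.
val sums = words.groupBy(0).reduce((l, r) => (l._1, l._2 + r._2))

// GroupReduce: sees the whole group at once and may emit zero, one,
// or many records, possibly of a different type.
val stats = words.groupBy(0).reduceGroup { it =>
  val group = it.toSeq
  (group.head._1, group.map(_._2).sum, group.size)
}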
Hi, Flink people, a question about translating Hive queries into Flink
functions using the Table API. To sum up, I am working on a benchmark for
Flink.
I am Philip Lee, a Master's student in Computer Science at TUB, and I work
on translating the benchmark's Hive queries into Flink code.
As…
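For flavor, the classic Table API word-count shape that such a translation typically targets; a sketch based on the Table API examples of that era, with made-up data:

import org.apache.flink.api.scala._
import org.apache.flink.api.scala.table._

case class WC(word: String, count: Int)

val env = ExecutionEnvironment.getExecutionEnvironment
val input = env.fromElements(WC("hello", 1), WC("ciao", 1), WC("hello", 1))
// Equivalent to: SELECT word, SUM(count) FROM wc GROUP BY word
val result = input.toTable
  .groupBy('word)
  .select('word, 'count.sum)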