Hi All,
Starting with the implementation, we are planning to take care of a single
batch job first. We will take up the scheduling aspect later.
The first requirement is the following:
A batch job is an Apex application which picks up data from the source, and
processes it. Once the data is com
Thanks for the suggestions and I am working on the process to migrate the
examples with the guidelines you mentioned. I will send out a list of
examples and the destination modules very soon.
On Thu, Oct 27, 2016 at 1:43 PM, Thomas Weise
wrote:
> Maybe a good first step would be to identify whi
[
https://issues.apache.org/jira/browse/APEXCORE-526?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15631520#comment-15631520
]
Munagala V. Ramanath commented on APEXCORE-526:
---
https://ci.apache.org/pro
[
https://issues.apache.org/jira/browse/APEXCORE-570?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15630848#comment-15630848
]
David Yan edited comment on APEXCORE-570 at 11/2/16 11:11 PM:
[
https://issues.apache.org/jira/browse/APEXCORE-570?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15630848#comment-15630848
]
David Yan commented on APEXCORE-570:
[~PramodSSImmaneni] Maybe we can't block it but
[
https://issues.apache.org/jira/browse/APEXCORE-570?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15630806#comment-15630806
]
Pramod Immaneni commented on APEXCORE-570:
--
[~davidyan] that would lead to dead
[
https://issues.apache.org/jira/browse/APEXCORE-570?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15630749#comment-15630749
]
Munagala V. Ramanath commented on APEXCORE-570:
---
I think part of the probl
[
https://issues.apache.org/jira/browse/APEXCORE-570?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15630694#comment-15630694
]
David Yan commented on APEXCORE-570:
If the spooling capacity limit is reached, woul
Suppose I am processing data in a file and want to do something at the
output operator when the file ends. I would send an end-of-file control
tuple and act on it when I receive it at the output. In a single window I
may end up processing multiple files and if I don't have multiple ports and
logic
With option 2, users can still do idempotent processing by delaying their
processing of the control tuples to end window. They have the flexibility
with this option. In the usual scenarios, you will have one port and given
that control tuples will be sent to all partitions, all the data sent
before
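The "delay control tuples to end window" idea above can be sketched roughly as follows. This is only an illustration under stated assumptions: DeferredControlOperator and its methods are hypothetical stand-ins and are not the Apex Operator API, and tuples are plain strings. Control tuples are only buffered on arrival and acted on in endWindow(), so their effect does not depend on where they fell among the data tuples of that window.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for an operator; NOT the Apex Operator API.
public class DeferredControlOperator {
    // Control tuples received in the current window, not yet acted on.
    private final List<String> pendingControl = new ArrayList<>();
    // Records the order in which things were actually processed.
    private final List<String> log = new ArrayList<>();

    public void beginWindow(long windowId) {
        log.add("begin:" + windowId);
    }

    public void processData(String tuple) {
        log.add("data:" + tuple);
    }

    // Only buffer the control tuple; do not act on it yet.
    public void processControl(String controlTuple) {
        pendingControl.add(controlTuple);
    }

    // Act on all buffered control tuples at the window boundary.
    public void endWindow() {
        for (String c : pendingControl) {
            log.add("control:" + c);
        }
        pendingControl.clear();
        log.add("end");
    }

    public List<String> getLog() {
        return log;
    }

    public static void main(String[] args) {
        DeferredControlOperator op = new DeferredControlOperator();
        op.beginWindow(1);
        op.processData("a");
        op.processControl("END_FILE f1"); // arrives mid-window
        op.processData("b");
        op.endWindow();
        // prints [begin:1, data:a, data:b, control:END_FILE f1, end]
        System.out.println(op.getLog());
    }
}
```

With this shape the operator developer chooses between acting immediately in processControl or deferring to endWindow, which is the flexibility argued for above.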
The use cases listed in the original discussion don't call for option 2. It
seems to come with additional complexity and implementation cost.
Can those in favor of option 2 please also provide the use case for it?
Thanks,
Thomas
On Wed, Nov 2, 2016 at 10:36 PM, Siyuan Hua wrote:
> I will vote
I will vote for approach 1.
First of all, that one sounds easier to do to me. And I think idempotency is
important. It may come at the cost of higher latency but I think it is ok.
In addition, if users do need real-time control tuple processing in the
future, we can always add the option on
Pramod,
To answer your questions, the control tuples will be delivered to all
downstream partitions, and an additional emitControl method (actual name
TBD) can be added to DefaultOutputPort without breaking backward
compatibility.
Also, to clarify, each operator should have the ability to block f
A feature that puts processing order, and more so idempotency, at risk is
reason enough to be wary of option 2. Is there a critical use case that
needs this feature?
Thks
Amol
On Wed, Nov 2, 2016 at 1:25 PM, Pramod Immaneni
wrote:
> I like approach 2 as it gives more fle
[
https://issues.apache.org/jira/browse/APEXMALHAR-2321?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15630594#comment-15630594
]
ASF GitHub Bot commented on APEXMALHAR-2321:
GitHub user brightchen opene
GitHub user brightchen opened a pull request:
https://github.com/apache/apex-malhar/pull/481
APEXMALHAR-2321 #resolve #comment Improve Buckets memory management
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/brightchen/apex-malh
As a rule of thumb in any real time operating system, control tuples should
always be handled using Priority Queues.
We may try to control priorities by defining levels. Control tuples shall
not be delivered at window boundaries.
In short, control tuples shall never be treated as any other tuples in real
ti
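A minimal sketch of the priority-queue handling suggested above, assuming just two levels (control ahead of data) and FIFO order within a level. PriorityTupleQueue, the level constants, and the string payloads are hypothetical names for illustration; nothing here reflects how the Apex buffer server or stream queues are implemented.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.PriorityQueue;

// Hypothetical illustration; not part of Apex.
public class PriorityTupleQueue {
    // Assumed convention: a lower level number is served first.
    public static final int CONTROL = 0;
    public static final int DATA = 1;

    private static final class Entry implements Comparable<Entry> {
        final int level;
        final long seq; // arrival order, keeps FIFO within a level
        final String payload;

        Entry(int level, long seq, String payload) {
            this.level = level;
            this.seq = seq;
            this.payload = payload;
        }

        @Override
        public int compareTo(Entry other) {
            int byLevel = Integer.compare(level, other.level);
            return byLevel != 0 ? byLevel : Long.compare(seq, other.seq);
        }
    }

    private final PriorityQueue<Entry> queue = new PriorityQueue<>();
    private long nextSeq = 0;

    public void offer(int level, String payload) {
        queue.add(new Entry(level, nextSeq++, payload));
    }

    // Removes and returns all queued payloads in service order.
    public List<String> drain() {
        List<String> out = new ArrayList<>();
        Entry e;
        while ((e = queue.poll()) != null) {
            out.add(e.payload);
        }
        return out;
    }

    public static void main(String[] args) {
        PriorityTupleQueue q = new PriorityTupleQueue();
        q.offer(DATA, "d1");
        q.offer(DATA, "d2");
        q.offer(CONTROL, "c1");
        q.offer(DATA, "d3");
        // prints [c1, d1, d2, d3]
        System.out.println(q.drain());
    }
}
```

Note that the control tuple overtakes data that was queued before it, which is the processing-order concern raised elsewhere in this thread.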
I like approach 2 as it gives more flexibility and also allows for
low-latency options. I think the following are important as well.
1. Delivering control tuples to all downstream partitions.
2. What mechanism will the operator developer use to send the control
tuple? Will it be an additional meho
Hi all,
I would like to renew the discussion of control tuples.
Last time, we were in a debate about whether:
1) the platform should enforce that control tuples are delivered at window
boundaries only
or:
2) the platform should deliver control tuples just as other tuples and it's
the operator
[
https://issues.apache.org/jira/browse/APEXCORE-570?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15630269#comment-15630269
]
Pramod Immaneni commented on APEXCORE-570:
--
For the inter-process case, if the
[
https://issues.apache.org/jira/browse/APEXCORE-570?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15630072#comment-15630072
]
Thomas Weise commented on APEXCORE-570:
---
From the JIRA description, this sounds l
[
https://issues.apache.org/jira/browse/APEXMALHAR-2274?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15629893#comment-15629893
]
Matt Zhang commented on APEXMALHAR-2274:
The scanner in FileSplitterInput is
Sanjay M Pujare created APEXMALHAR-2326:
---
Summary: Failures in SQS unit tests
Key: APEXMALHAR-2326
URL: https://issues.apache.org/jira/browse/APEXMALHAR-2326
Project: Apache Apex Malhar
GitHub user d9liang opened a pull request:
https://github.com/apache/apex-malhar/pull/480
[APEXMALHAR-2220] Move the FunctionOperator to Malhar library
Merge function operators under org.apache.apex.malhar.stream.api.operator
and function interface under org.apache.apex.malhar.strea
[
https://issues.apache.org/jira/browse/APEXMALHAR-2220?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15629562#comment-15629562
]
ASF GitHub Bot commented on APEXMALHAR-2220:
GitHub user d9liang opened a
[
https://issues.apache.org/jira/browse/APEXMALHAR-2302?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Yogi Devendra resolved APEXMALHAR-2302.
---
Resolution: Fixed
Fix Version/s: 3.6.0
> Exposing few properties of FSSpli
Github user asfgit closed the pull request at:
https://github.com/apache/apex-malhar/pull/457
---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is
[
https://issues.apache.org/jira/browse/APEXMALHAR-2302?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15629394#comment-15629394
]
ASF GitHub Bot commented on APEXMALHAR-2302:
Github user asfgit closed th
[
https://issues.apache.org/jira/browse/APEXMALHAR-2325?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15628804#comment-15628804
]
ASF GitHub Bot commented on APEXMALHAR-2325:
GitHub user chaithu14 opened
GitHub user chaithu14 opened a pull request:
https://github.com/apache/apex-malhar/pull/479
APEXMALHAR-2325 1) Set the file system default block size to the reader. 2)
Set the block size to the reader context
You can merge this pull request into a Git repository by running:
$
Chaitanya created APEXMALHAR-2325:
-
Summary: Same block id is emitting from FSInputModule
Key: APEXMALHAR-2325
URL: https://issues.apache.org/jira/browse/APEXMALHAR-2325
Project: Apache Apex Malhar
[
https://issues.apache.org/jira/browse/APEXCORE-570?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15628460#comment-15628460
]
Pramod Immaneni commented on APEXCORE-570:
--
Here is an example on how to do thi
Pramod Immaneni created APEXCORE-570:
Summary: Prevent upstream operators from getting too far ahead
when downstream operators are slow
Key: APEXCORE-570
URL: https://issues.apache.org/jira/browse/APEXCORE-570
Hi Apex Dev Community,
For Fixed Width S3 record Reader, the input is the block metadata
containing the block offset and the block length.
The length of the block may not be a multiple of the length of the record.
(For example, the block length can be 1 MB and the record length 23 bytes.)
Hence, the first byte
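The arithmetic behind this misalignment can be sketched as below. This is an illustration only, under two assumptions that are not stated in the message: records are laid out back to back from offset 0 of the file, and a record belongs to the block in which it starts. FixedWidthBlockAlignment and its methods are hypothetical names, not part of Malhar.

```java
// Hypothetical helper for illustration; not part of Apex Malhar.
public class FixedWidthBlockAlignment {

    // Offset of the first record that starts at or after blockOffset,
    // assuming records are packed back to back from file offset 0.
    public static long firstRecordStart(long blockOffset, int recordLength) {
        long remainder = blockOffset % recordLength;
        return remainder == 0 ? blockOffset : blockOffset + (recordLength - remainder);
    }

    // Number of records whose first byte lies in
    // [blockOffset, blockOffset + blockLength).
    public static long recordsStartingInBlock(long blockOffset, long blockLength, int recordLength) {
        long first = firstRecordStart(blockOffset, recordLength);
        long blockEnd = blockOffset + blockLength;
        if (first >= blockEnd) {
            return 0;
        }
        // Ceiling division: the last record may spill into the next block.
        return (blockEnd - first + recordLength - 1) / recordLength;
    }

    public static void main(String[] args) {
        long blockLength = 1024L * 1024L; // 1 MB
        int recordLength = 23;
        for (int block = 0; block < 3; block++) {
            long offset = block * blockLength;
            System.out.println("block " + block
                + " firstRecordStart=" + firstRecordStart(offset, recordLength)
                + " records=" + recordsStartingInBlock(offset, blockLength, recordLength));
        }
    }
}
```

With 1 MB blocks and 23-byte records, the second block begins 6 bytes into a record, so under these assumptions its first whole record starts 17 bytes after the block offset.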