[
https://issues.apache.org/jira/browse/HADOOP-1823?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12549129
]
Benjamin Reed commented on HADOOP-1823:
---
We can't really wrap bzip because it is constructed with a
[
https://issues.apache.org/jira/browse/HADOOP-1883?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12528403
]
Benjamin Reed commented on HADOOP-1883:
---
Using DDLs with RPC shouldn't be inefficient. The schema of
[
https://issues.apache.org/jira/browse/HADOOP-1859?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12525961
]
Benjamin Reed commented on HADOOP-1859:
---
I second everything Raghu said! I was going to say exactly the same
Issue Type: Bug
Components: fs
Affects Versions: 0.14.1
Reporter: Benjamin Reed
If an FSInputDataStream object has been closed, invoking getPos() will cause a
NullPointerException. This is because BufferedInputStream.close() sets its "in"
field to null, and Buffer.getPos
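The failure mode can be reproduced outside Hadoop. A minimal sketch, with illustrative class names (this `Buffer` is a stand-in, not Hadoop's actual inner class): BufferedInputStream.close() nulls its protected `in` field, so any subclass method that still dereferences `in` afterwards throws NullPointerException.

```java
import java.io.BufferedInputStream;
import java.io.ByteArrayInputStream;
import java.io.InputStream;

public class GetPosAfterClose {
    static class Buffer extends BufferedInputStream {
        Buffer(InputStream in) { super(in); }
        // Hypothetical getPos() stand-in: consults the wrapped stream via `in`.
        int getPos() throws java.io.IOException {
            return in.available();   // `in` is null once close() has run
        }
    }

    public static void main(String[] args) throws Exception {
        Buffer b = new Buffer(new ByteArrayInputStream(new byte[8]));
        b.close();                   // BufferedInputStream sets `in` to null here
        try {
            b.getPos();
            System.out.println("no exception");
        } catch (NullPointerException e) {
            System.out.println("NullPointerException after close, as described");
        }
    }
}
```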
of blocks from different
> (or is it just one?) files.
> I think it is possible, but very non-POSIX.
> And you can always create one-block files and group them in directories
> instead.
>
> --Konstantin
>
> Benjamin Reed wrote:
> > I need to implement COW for HDFS for
I need to implement COW for HDFS for a project I'm working on. I vaguely
remember it being discussed before, but I can't find any threads about
it. I wanted to at least check for interest/previous work before
proceeding. Hard links would work for me as well, but they are harder to
implement. I
[
https://issues.apache.org/jira/browse/HADOOP-435?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Benjamin Reed resolved HADOOP-435.
--
Resolution: Won't Fix
The encapsulating Jar aspect doesn't seem to be an issue to m
[
https://issues.apache.org/jira/browse/HADOOP-435?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12491659
]
Benjamin Reed commented on HADOOP-435:
--
Unjaring is gratuitous and annoying. There are environments, such as
[
https://issues.apache.org/jira/browse/HADOOP-435?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12491476
]
Benjamin Reed commented on HADOOP-435:
--
Yes, bin/hadoop should invoke HadoopExe. The big if/else in bin/hadoop
[
https://issues.apache.org/jira/browse/HADOOP-435?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12491421
]
Benjamin Reed commented on HADOOP-435:
--
My point is that we should really move forward on this issue. Once we
[
https://issues.apache.org/jira/browse/HADOOP-435?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12490966
]
Benjamin Reed commented on HADOOP-435:
--
1) Yes, that is silly. I was actually thinking of changing this so that
[
https://issues.apache.org/jira/browse/HADOOP-435?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12490193
]
Benjamin Reed commented on HADOOP-435:
--
1) There are two special commands conf and setup. IMHO I think making
[
https://issues.apache.org/jira/browse/HADOOP-435?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Benjamin Reed updated HADOOP-435:
-
Status: Patch Available (was: Open)
> Encapsulating startup scripts and jars in a single
[
https://issues.apache.org/jira/browse/HADOOP-435?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Benjamin Reed updated HADOOP-435:
-
Attachment: hadoop-exe.patch
Removed erroneous update of version
> Encapsulating star
[
https://issues.apache.org/jira/browse/HADOOP-435?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Benjamin Reed updated HADOOP-435:
-
Fix Version/s: 0.12.1
0.13.0
Affects Version/s: (was: 0.12.0
[
https://issues.apache.org/jira/browse/HADOOP-435?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Benjamin Reed updated HADOOP-435:
-
Attachment: hadoop-exe.patch
I've incorporated Doug's comments. The patch to Status
Components: mapred
Affects Versions: 0.12.1
Reporter: Benjamin Reed
Fix For: 0.13.0, 0.12.1, 0.12.0
StatusHttpServer uses ClassLoader.getResource() to find the webapps, but then
assumes it is a file URL and extracts the filename. This requires the webapps
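The assumption breaks because getResource() only returns a plain file path when the resource sits on disk; from inside a jar it returns a jar:-scheme URL whose getFile() is not a usable filename. A small sketch with illustrative paths:

```java
import java.net.URL;

public class ResourceUrlCheck {
    public static void main(String[] args) throws Exception {
        // Illustrative URLs of the two shapes getResource() can return.
        URL onDisk = new URL("file:/opt/hadoop/webapps/status");
        URL inJar  = new URL("jar:file:/opt/hadoop.jar!/webapps/status");

        System.out.println(onDisk.getProtocol()); // "file": getFile() is a path
        System.out.println(inJar.getProtocol());  // "jar": getFile() is not
        System.out.println(inJar.getFile());      // file:/opt/hadoop.jar!/webapps/status
    }
}
```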
[
https://issues.apache.org/jira/browse/HADOOP-1137?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Benjamin Reed updated HADOOP-1137:
--
Status: Patch Available (was: Open)
> StatusHttpServer assumes that resources for /sta
[
https://issues.apache.org/jira/browse/HADOOP-1137?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Benjamin Reed updated HADOOP-1137:
--
Attachment: StatusHttpServer.patch
> StatusHttpServer assumes that resources for /static
[
https://issues.apache.org/jira/browse/HADOOP-435?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12476384
]
Benjamin Reed commented on HADOOP-435:
--
1. I'm not stuck on the name. Anything is good.
2. Agreed.
3.
[
https://issues.apache.org/jira/browse/HADOOP-435?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Benjamin Reed updated HADOOP-435:
-
Fix Version/s: 0.12.0
Affects Version/s: (was: 0.5.0)
0.12.0
[
https://issues.apache.org/jira/browse/HADOOP-435?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Benjamin Reed updated HADOOP-435:
-
Attachment: hadoopit.patch
> Encapsulating startup scripts and jars in a single Jar f
[
https://issues.apache.org/jira/browse/HADOOP-435?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Benjamin Reed updated HADOOP-435:
-
Attachment: hadoopit.patch
Patch against version 10.1. Added target hadoopit that builds a self
[
https://issues.apache.org/jira/browse/HADOOP-941?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12468374
]
Benjamin Reed commented on HADOOP-941:
--
Using the Hadoop jar would be an extreme cognitive burden. Plunking a
[
https://issues.apache.org/jira/browse/HADOOP-941?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12468058
]
Benjamin Reed commented on HADOOP-941:
--
I will admit to being one of the motivators of this bug. We have found
[
https://issues.apache.org/jira/browse/HADOOP-933?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12467956
]
Benjamin Reed commented on HADOOP-933:
--
I found another place that assumed FileSplit. See attached patch. Our
[
https://issues.apache.org/jira/browse/HADOOP-933?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Benjamin Reed updated HADOOP-933:
-
Attachment: JobInProgress.patch
> Application defined InputSplits do not w
[
https://issues.apache.org/jira/browse/HADOOP-933?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12467711
]
Benjamin Reed commented on HADOOP-933:
--
Unfortunately, the workaround wouldn't work either. The MapTask
[
https://issues.apache.org/jira/browse/HADOOP-933?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12467651
]
Benjamin Reed commented on HADOOP-933:
--
867 might be a better solution, but it is an enhancement. This is a bug
[
https://issues.apache.org/jira/browse/HADOOP-933?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Benjamin Reed updated HADOOP-933:
-
Attachment: MapTask.patch
> Application defined InputSplits do not w
[
https://issues.apache.org/jira/browse/HADOOP-933?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Benjamin Reed updated HADOOP-933:
-
Fix Version/s: 0.10.1
Status: Patch Available (was: Open)
> Application defi
: 0.10.1
Reporter: Benjamin Reed
If an application defines its own InputSplit, the task tracker chokes when it
cannot deserialize the InputSplit when it deserializes MapTasks it receives
from the JobTracker. This is because the TaskTracker does not resolve classes
from the job jar
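The kind of fix this implies can be sketched as follows: before deserializing a task's InputSplit, the TaskTracker needs a classloader that can see the job's jar. The names below (jobJarPath, splitClassName) are illustrative, not Hadoop's actual API.

```java
import java.io.File;
import java.net.URL;
import java.net.URLClassLoader;

public class JobJarResolver {
    // Resolve a class by name, consulting the job jar in addition to the
    // TaskTracker's own classpath (parent loader).
    static Class<?> resolveSplitClass(String jobJarPath, String splitClassName)
            throws Exception {
        URL[] urls = { new File(jobJarPath).toURI().toURL() };
        ClassLoader loader =
            new URLClassLoader(urls, JobJarResolver.class.getClassLoader());
        // Class.forName with an explicit loader finds application classes
        // that are absent from the system classpath.
        return Class.forName(splitClassName, true, loader);
    }
}
```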
[
http://issues.apache.org/jira/browse/HADOOP-367?page=comments#action_12442183 ]
Benjamin Reed commented on HADOOP-367:
--
[[ Old comment, sent by email on Tue, 25 Jul 2006 10:30:37 -0700 ]]
When would you call setAccessible? It would
[
http://issues.apache.org/jira/browse/HADOOP-287?page=comments#action_12442138 ]
Benjamin Reed commented on HADOOP-287:
--
[[ Old comment, sent by email on Thu, 8 Jun 2006 07:32:09 -0700 ]]
Duh! Sorry Doug. Stupid error. I didn
[
http://issues.apache.org/jira/browse/HADOOP-602?page=comments#action_12442030 ]
Benjamin Reed commented on HADOOP-602:
--
Since we have switched to 1.5, can we use java.util.PriorityQueue?
> Remove Lucene depende
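The suggestion above is straightforward since Java 5: java.util.PriorityQueue provides the heap behavior a Lucene dependency would otherwise supply. A minimal sketch (Java 5 syntax, no diamond operator):

```java
import java.util.PriorityQueue;

public class PqDemo {
    public static void main(String[] args) {
        // A min-heap of the standard library; poll() always yields the
        // smallest remaining element.
        PriorityQueue<Integer> pq = new PriorityQueue<Integer>();
        pq.add(3);
        pq.add(1);
        pq.add(2);

        StringBuilder out = new StringBuilder();
        while (!pq.isEmpty()) {
            out.append(pq.poll());
        }
        System.out.println(out); // 123
    }
}
```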
Locality hints for Reduce
-
Key: HADOOP-589
URL: http://issues.apache.org/jira/browse/HADOOP-589
Project: Hadoop
Issue Type: New Feature
Components: mapred
Reporter: Benjamin Reed
It would be
[
http://issues.apache.org/jira/browse/HADOOP-587?page=comments#action_12440951 ]
Benjamin Reed commented on HADOOP-587:
--
This is a duplicate of HADOOP-287
> SequenceFile sort should use quicksort instead of merge sort for sorting r
[
http://issues.apache.org/jira/browse/HADOOP-580?page=comments#action_12440557 ]
Benjamin Reed commented on HADOOP-580:
--
No. I'm very against running code in the Trackers (as my mail indicates :). The
idea would be that you would
Job setup and take down on Nodes
Key: HADOOP-580
URL: http://issues.apache.org/jira/browse/HADOOP-580
Project: Hadoop
Issue Type: New Feature
Components: mapred
Reporter: Benjamin Reed
rs so that the
InputFormats running in Childs can deserialize the full type.
ben
Owen O'Malley wrote:
>
> On Sep 29, 2006, at 12:20 AM, Benjamin Reed wrote:
>
> > Please correct me if I'm reading the code incorrectly, but it seems
>> like submitJob puts the submit
Please correct me if I'm reading the code incorrectly, but it seems
like submitJob puts the submitted job on the jobInitQueue which is
immediately dequeued by the JobInitThread and then initTasks() will get
the file splits and create Tasks. Thus, it doesn't seem like there is
any difference in me
One of the things that bothers me about the JobTracker is that it is
running user code when it creates the FileSplits. In the long term this
puts the JobTracker JVM at risk due to errors in the user code.
The JobTracker uses the InputFormat to create a set of tasks that it
then schedules. The task
I like Solution 3 as long as there was an API to query the logs.
ben
On Tuesday 29 August 2006 11:33, Mahadev konar (JIRA) wrote:
> Seperating user logs from system logs in map reduce
> ---
>
> Key: HADOOP-489
> URL
[
http://issues.apache.org/jira/browse/HADOOP-372?page=comments#action_12431029 ]
Benjamin Reed commented on HADOOP-372:
--
I like this proposal. Assuming that the RecordReader in getMapper is the really
record reader (ie the one returned
[
http://issues.apache.org/jira/browse/HADOOP-372?page=comments#action_12430992 ]
Benjamin Reed commented on HADOOP-372:
--
We have a desperate need to be able to specify different inputformat classes,
mappers, and partition functions in the
[
http://issues.apache.org/jira/browse/HADOOP-281?page=comments#action_12430987 ]
Benjamin Reed commented on HADOOP-281:
--
Have we considered this to be a feature? Some newer general-purpose file systems
allow a file to also be a directory
[
http://issues.apache.org/jira/browse/HADOOP-435?page=comments#action_12428141 ]
Benjamin Reed commented on HADOOP-435:
--
This does separate control of all the daemons just like the hadoop script. It
encapsulates all the jar files, the
[
http://issues.apache.org/jira/browse/HADOOP-445?page=comments#action_12427639 ]
Benjamin Reed commented on HADOOP-445:
--
You are right. It is the same. What was the performance difference you saw
after the patch?
There actually is a
[
http://issues.apache.org/jira/browse/HADOOP-435?page=comments#action_12427636 ]
Benjamin Reed commented on HADOOP-435:
--
I'm not suggesting that we use my scripts to replace the hadoop scripts, I'm
just showing how thi
[
http://issues.apache.org/jira/browse/HADOOP-448?page=comments#action_12427619 ]
Benjamin Reed commented on HADOOP-448:
--
Okay, I guess the problem is that JobTracker does not call
setWorkingDirectory() before invoking the InputFormat
[
http://issues.apache.org/jira/browse/HADOOP-435?page=comments#action_12427618 ]
Benjamin Reed commented on HADOOP-435:
--
I've attached the scripts. They are quite convenient to use. To start the
cluster I simply do "for i in `ca
[ http://issues.apache.org/jira/browse/HADOOP-435?page=all ]
Benjamin Reed updated HADOOP-435:
-
Attachment: stop.sh
Stops everything running against a jar file.
> Encapsulating startup scripts and jars in a single Jar f
[ http://issues.apache.org/jira/browse/HADOOP-435?page=all ]
Benjamin Reed updated HADOOP-435:
-
Attachment: start.sh
This script starts up a datanode and tasktracker. If the config indicates that
the node is the jobtracker, it also starts the
[
http://issues.apache.org/jira/browse/HADOOP-433?page=comments#action_12427610 ]
Benjamin Reed commented on HADOOP-433:
--
getSplit() would address my need. However, I can imagine in the future that we
would like access to the RecordReader
Issue Type: Bug
Reporter: Benjamin Reed
DistributedFileSystem initializes the working directory to be new Path("/user",
System.getProperty("user.name")); rather than new Path("/user",
conf.get("user.name")); the initialization would have to be
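The distinction matters because the JVM property always reflects the local OS user, while a configured value can be set per job, so the two diverge whenever a job runs on behalf of another user. A sketch, using java.util.Properties as an illustrative stand-in for Hadoop's Configuration object:

```java
import java.util.Properties;

public class WorkingDirInit {
    static String workingDir(Properties conf) {
        // Buggy form: always the OS-level user.
        //   String user = System.getProperty("user.name");
        // Suggested form: the configured user, falling back to the property.
        String user = conf.getProperty("user.name",
                                       System.getProperty("user.name"));
        return "/user/" + user;
    }

    public static void main(String[] args) {
        Properties conf = new Properties();
        conf.setProperty("user.name", "ben"); // per-job identity
        System.out.println(workingDir(conf)); // /user/ben
    }
}
```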
: Bug
Components: dfs
Reporter: Benjamin Reed
getBlockSize() does not check for an absolute path like the rest of the
DistributedFileSystem API does. Consequently getBlockSize(Path) does not work
with relative paths.
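The fix pattern used elsewhere in the API can be sketched simply: qualify a relative path against the working directory before resolving it. The names here (makeAbsolute, workingDir) are illustrative, not Hadoop's exact ones.

```java
public class PathQualify {
    static String workingDir = "/user/ben"; // illustrative working directory

    // Qualify a relative path before any block-size (or other metadata)
    // lookup, matching what the rest of the filesystem API already does.
    static String makeAbsolute(String path) {
        return path.startsWith("/") ? path : workingDir + "/" + path;
    }

    public static void main(String[] args) {
        System.out.println(makeAbsolute("data/part-0")); // /user/ben/data/part-0
        System.out.println(makeAbsolute("/data"));       // /data (unchanged)
    }
}
```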
--
This message is automatically generated by JIRA.
-
If
Versions: 0.5.0
Reporter: Benjamin Reed
Priority: Minor
In the overall scheme of things this is probably a nit, but in the run() method
of DataXceiveServer in DataNode.java the method "data.checkDataDir()" is called
right after the socket accept. data.checkData
Reporter: Benjamin Reed
Attachments: fastClientWrite.patch
Currently, as DFS clients output blocks they write the entire block to disk
before starting to transmit to the datanode. By writing to disk the client is
able to retry a block write if the datanode fails in the middle of a
Affects Versions: 0.5.0
Reporter: Benjamin Reed
Attachments: hadoopit.patch
Currently, hadoop is a set of scripts, configurations, and jar files. It makes
it a pain to install on compute and datanodes. It also makes it a pain to set up
clients so that they can use hadoop. Every time
[
http://issues.apache.org/jira/browse/HADOOP-433?page=comments#action_12426928 ]
Benjamin Reed commented on HADOOP-433:
--
If I understand correctly, you are suggesting that I instantiate another
RecordReader using the information in
Reporter: Benjamin Reed
Priority: Minor
The record reader has access to the FileSplit which can in turn have
information that is useful to the Mapper. For example, Map processing may vary
according to file name or attributes associated with a file. Unfortunately,
even using a
[ http://issues.apache.org/jira/browse/HADOOP-367?page=all ]
Benjamin Reed updated HADOOP-367:
-
Attachment: f.patch
Arg! It seems that public static methods on non-public classes cannot be called
with reflection :( So I took a different approach that
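The limitation can be demonstrated with a JDK class rather than Hadoop's (the example below is illustrative): even a public method cannot be invoked reflectively when its declaring class is non-public.

```java
import java.lang.reflect.Method;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

public class ReflectNonPublic {
    public static void main(String[] args) throws Exception {
        List<Integer> list = Collections.unmodifiableList(Arrays.asList(1, 2));
        // list.getClass() is a non-public java.util.Collections inner class;
        // size() is public, but its declaring class is not.
        Method size = list.getClass().getMethod("size");
        try {
            size.invoke(list);
            System.out.println("invoked");
        } catch (IllegalAccessException e) {
            System.out.println("IllegalAccessException: non-public declaring class");
        }
    }
}
```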
[ http://issues.apache.org/jira/browse/HADOOP-287?page=all ]
Benjamin Reed updated HADOOP-287:
-
Attachment: s.patch
> Speed up SequenceFile sort with memory reduction
>
>
> Key
[
http://issues.apache.org/jira/browse/HADOOP-287?page=comments#action_12423847 ]
Benjamin Reed commented on HADOOP-287:
--
I have improved it a bit more. It now is guaranteed to only take logN stack
space, and I eked out a bit more
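The logN stack-space guarantee mentioned above comes from a standard quicksort trick (sketched here as the general technique, not the exact patch): recurse only into the smaller partition and loop on the larger one, so recursion depth is bounded by log2(n).

```java
public class BoundedStackQuickSort {
    static void sort(int[] a, int lo, int hi) {
        while (lo < hi) {
            int p = partition(a, lo, hi);
            if (p - lo < hi - p) {   // left side smaller: recurse there,
                sort(a, lo, p - 1);
                lo = p + 1;          // then iterate on the larger right side
            } else {                 // right side smaller: recurse there,
                sort(a, p + 1, hi);
                hi = p - 1;          // then iterate on the larger left side
            }
        }
    }

    // Lomuto partition: a[hi] is the pivot; returns its final index.
    static int partition(int[] a, int lo, int hi) {
        int pivot = a[hi], i = lo;
        for (int j = lo; j < hi; j++) {
            if (a[j] < pivot) { int t = a[i]; a[i] = a[j]; a[j] = t; i++; }
        }
        int t = a[i]; a[i] = a[hi]; a[hi] = t;
        return i;
    }
}
```

The in-place swaps are also what removes the merge sort's extra buffer, matching the memory reduction claimed in the issue title.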
[
http://issues.apache.org/jira/browse/HADOOP-367?page=comments#action_12423469 ]
Benjamin Reed commented on HADOOP-367:
--
When would you call setAccessible? It would have to be outside of the class
since nothing is initialized at class
Type: Bug
Components: dfs
Affects Versions: 0.4.0
Environment: Java 5.0
Reporter: Benjamin Reed
There seems to be a change that happened between 1.4 and 1.5 with respect to
static initializers. I can't find this documented, but I can reproduce with a
very s
I do such diffs all the time with Subclipse. It's great. It's also nice for
managing patches.
ben
On Monday 17 July 2006 09:30, Thomas FRIOL wrote:
> Hi Owen,
>
> If you are an Eclipse user, I suggest you use Subversive [1] or
> Subclipse [2] plugin.
>
> [1] http://www.polarion.org/index.php?
: 0.3.2
Reporter: Benjamin Reed
Priority: Minor
JobClient does some checking of the job being submitted when it submits a jar
file along with the job. The problem is that the JobClient pulls classes from
the classpath rather than the submitted jar file. Because the jar file may
[ http://issues.apache.org/jira/browse/HADOOP-303?page=all ]
Benjamin Reed updated HADOOP-303:
-
Attachment: jobcl-fix.patch
> JobClient looking for classes for submitted job in the wrong pl
[ http://issues.apache.org/jira/browse/HADOOP-287?page=all ]
Benjamin Reed updated HADOOP-287:
-
Attachment: zoom-sort.patch
My previous patch had two minor typos that gave incorrect results. This patch
should work.
> Speed up SequenceFile sort w
: Benjamin Reed
Attachments: zoom-sort.patch
I replaced the merge sort with a quick sort and it yielded approx 30%
improvement in sort time. It also reduced the memory requirement for sorting
because the sort is done in place.
[ http://issues.apache.org/jira/browse/HADOOP-287?page=all ]
Benjamin Reed updated HADOOP-287:
-
Attachment: zoom-sort.patch
> Speed up SequenceFile sort with memory reduction
>
>
>
[ http://issues.apache.org/jira/browse/HADOOP-249?page=all ]
Benjamin Reed updated HADOOP-249:
-
Attachment: disk_zoom.patch
task_zoom.patch
> Improving Map -> Reduce performance and Task JVM
[ http://issues.apache.org/jira/browse/HADOOP-249?page=all ]
Benjamin Reed updated HADOOP-249:
-
Attachment: image001.png
> Improving Map -> Reduce performance and Task JVM reuse
> --
>
>
Improving Map -> Reduce performance and Task JVM reuse
--
Key: HADOOP-249
URL: http://issues.apache.org/jira/browse/HADOOP-249
Project: Hadoop
Type: Improvement
Versions: 0.3
Reporter: Benjamin R
[ http://issues.apache.org/jira/browse/HADOOP-235?page=all ]
Benjamin Reed updated HADOOP-235:
-
Attachment: missing-f.patch
> LocalFileSystem.openRaw() throws the wrong string for FileNotFoundExcept
: Benjamin Reed
openRaw should throw f.toString() on an error, not toString().
[
http://issues.apache.org/jira/browse/HADOOP-170?page=comments#action_12376576 ]
Benjamin Reed commented on HADOOP-170:
--
It's really JobTracker, not the fs, that knows how high to set the replication
count since the JobTracker will know the numb