"Nilay Puri, Noida" <[EMAIL PROTECTED]> wrote:
>Can any one walk me thru this piece of code ::
>
>while(<STDIN>)
>{
> chomp ;
> $isbn =(split(/^_/, $_))[0] ; --- not able to understand what is
>being accessed (......)[0]
> unless ($KEYS{$isbn} ) ---- isbn is a scalar variable, how keys
>wok on it ?
> {
> print "$_\n" ;
> $KEYS{$isbn} =1 ;
> }
>}
I'm not sure what the intent of the code is, but I would guess you're parsing a set of
lines from a file, each line containing an ISBN and some other data, to extract all
the unique ISBNs from it, and print those numbers without repeating duplicates, as a
side effect leaving you with a hash containing all those unique ISBNs.
It would be useful to see some of the data the code is intended to process.
Assuming that's the goal, you're not far from it. The (......)[0] says "take the
result of the split, which is a list, and get the 0th or first element from it". This
works because you can subscript a list the same way you can subscript an array
variable. In other words, the expression gets the first thing from the list that
results from splitting, which is probably meant to be the first thing on the line,
which is probably meant to be the ISBN.
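To make the (......)[0] idea concrete, here is a minimal sketch. The sample line and the ":" separator are made up for illustration, since we haven't seen your real data:

```perl
use strict;
use warnings;

# Hypothetical sample line; the real data format is unknown.
my $line = "0131103628:The C Programming Language";

# split returns a list; the [0] subscript picks its first element.
my $first = (split /:/, $line)[0];

# The same thing in two steps, going through a named array.
my @fields     = split /:/, $line;
my $also_first = $fields[0];

print "$first\n";        # prints 0131103628
print "$also_first\n";   # prints 0131103628
```

Both forms give the same value; the list subscript just saves you the temporary array.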
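The $KEYS{$isbn} expression is a hash access: it fetches from the hash %KEYS the value
associated with the key $isbn. You're right that 'keys' is a function used with
hashes, but Perl is case-sensitive, so KEYS here has nothing to do with that function;
%KEYS is simply a hash variable that happens to have that name.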
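Here is a small sketch of the "have I seen this key before?" idiom your loop relies on. The hash name %seen and the sample values are my own, chosen only for illustration:

```perl
use strict;
use warnings;

my %seen;     # hypothetical name; plays the role of %KEYS in your code
my @unique;

for my $isbn (qw(1234 5678 1234 9012 5678)) {   # made-up sample values
    next if $seen{$isbn};    # already recorded, so skip it
    $seen{$isbn} = 1;        # first sighting: remember it
    push @unique, $isbn;
}

print "@unique\n";                         # prints 1234 5678 9012

# The built-in keys function (lowercase) lists the keys stored in the hash.
print join(",", sort keys %seen), "\n";    # prints 1234,5678,9012
```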
The split pattern seems unusual. As written, it says "split the string at the
beginning if the line starts with an underscore". The caret character in the pattern
will match on the start of the string; the string will consist of an entire line read
from the file, put into the variable $_ by the read operation <STDIN>. I've never
tried to split on the beginning of the string, so let's write a test script that does
that and see what happens.
testsplitBOL.pl
---------------
    use warnings;
    use strict;
    my @result;
    while(<STDIN>){
        chomp;
        print "Processing line: >$_<\n";
        @result = split( /^_/, $_);
        print "Split resulted in ", scalar(@result), " items.\n";
        print "First element of split is >", $result[0], "<.\n";
    }
result:
-------
D:\MCD\dvl\scripts>type testsplitBOL.txt
1234 some text
_5678 some other text

_9012_some_other_text_separated_by_underscores
_7654
0987
D:\MCD\dvl\scripts>type testsplitBOL.txt | perl testsplitBOL.pl
Processing line: >1234 some text<
Split resulted in 1 items.
First element of split is >1234 some text<.
Processing line: >_5678 some other text<
Split resulted in 2 items.
First element of split is ><.
Processing line: ><
Split resulted in 0 items.
Use of uninitialized value in print at testsplitBOL.pl line 11, <STDIN> line 3.
First element of split is ><.
Processing line: >_9012_some_other_text_separated_by_underscores<
Split resulted in 2 items.
First element of split is ><.
Processing line: >_7654<
Split resulted in 2 items.
First element of split is ><.
Processing line: >0987<
Split resulted in 1 items.
First element of split is >0987<.
From these results we can see several things:
- Splitting on the beginning of the string, when the pattern matches, gives you an
empty string as the first element of the resulting list. Your code would take this to
be an ISBN and use it as a hash key, which is certainly not correct.
- When the split does not match its pattern, it yields a list consisting of a single
element, the original string.
- The pattern in split only matches lines that begin with an underscore. Whether or
not that's what you want depends on your data.
- The code should have a test to make sure the line is not just an empty string;
splitting an empty line yields no elements at all, hence the warning above.
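One way to act on these observations, sketched with made-up lines modelled on the test file: skip empty lines up front, and refuse to use an empty or missing first field as a key. The wording of the report messages is my own invention.

```perl
use strict;
use warnings;

# Made-up input, modelled on the test file above.
my @lines = ("1234 some text", "_5678 some other text", "", "_7654");

my @report;
for my $line (@lines) {
    # Guard 1: an empty line splits into nothing, so skip it first.
    if ($line eq "") {
        push @report, "skipped empty line";
        next;
    }

    my $first = (split /^_/, $line)[0];

    # Guard 2: a leading underscore leaves an empty first field.
    if (!defined $first || $first eq "") {
        push @report, "empty first field: >$line<";
        next;
    }

    push @report, "usable first field: >$first<";
}

print "$_\n" for @report;
```

With these four lines, only the first one yields a usable first field, which is a strong hint that /^_/ is not the separator you want.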
Note that the Perl documentation for split says:
    A PATTERN of /^/ is treated as if it were /^/m, since it isn't much use otherwise
but that doesn't seem to apply here, both because your pattern is not /^/ (rather, it
is /^_/) and because that doesn't seem to be what's happening in the test results.
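The difference is easier to see on a string that contains an embedded newline. This is only a sketch on a made-up two-line string, not on your data:

```perl
use strict;
use warnings;

my $text = "_a\n_b";   # made-up string containing one embedded newline

my @bare  = split /^/,   $text;   # special case: behaves like /^/m
my @plain = split /^_/,  $text;   # no /m: ^ matches only at the start of the string
my @multi = split /^_/m, $text;   # explicit /m: ^ also matches after the newline

print scalar(@bare),  "\n";   # prints 2: "_a\n" and "_b"
print scalar(@plain), "\n";   # prints 2: "" and "a\n_b"
print scalar(@multi), "\n";   # prints 3: "", "a\n" and "b"
```

Since your loop reads one line at a time and chomps it, there is no embedded newline for /m to act on, so the special case would make no difference even if it did apply.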
I'd be glad to help you code up your loop, but we really need to see a sample of data
to understand the task. In any event, I think you want something like this:
    use warnings;
    use strict;
    my %KEYS = ();
    my $isbn;
    my @result;
    while(<STDIN>){
        chomp;
        # select only those lines to split: not empty and start with underscore, or whatever
        if( /^_/ ){
            # split on whatever separates ISBN from what follows it on line
            @result = split( /:/, $_);
            # make sure the split actually split something
            if( scalar( @result ) > 1 ){
                # we assume the ISBN is first thing on the line
                $isbn = $result[0];
                # make sure this ISBN hasn't already been printed before printing it
                unless( $KEYS{$isbn} ){
                    print "$_\n";
                    $KEYS{$isbn} = 1;
                }
            }
            else {
                die "Error processing line $.: $_ could not be split.\n";
            }
        }
    }
Or something like that. You could make it more concise, but that's the basic idea.
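For what it's worth, a more concise variant might look like the sketch below. It keeps the same guesses as the loop above (wanted lines start with an underscore, ":" separates the ISBN from the rest), uses made-up sample lines in place of <STDIN>, and the name %seen is my own. Unlike the longer version, it does not die on a line that has no separator.

```perl
use strict;
use warnings;

# Made-up sample lines standing in for what would be read from <STDIN>.
my @lines = (
    "_1234:first title",
    "0987:no leading underscore",
    "_1234:duplicate entry",
    "_5678:second title",
);

my %seen;   # hypothetical name; same role as %KEYS
my @out;

for my $line (@lines) {
    next unless $line =~ /^_/;           # assumed: wanted lines start with "_"
    my ($isbn) = split /:/, $line, 2;    # assumed: ":" ends the ISBN field
    # $seen{$isbn}++ is false only the first time a given ISBN appears.
    push @out, $line unless $seen{$isbn}++;
}

print "$_\n" for @out;
# prints:
# _1234:first title
# _5678:second title
```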
Show us your data!