Alright, I've made some changes to the patch. I went ahead and renamed
string_strided_select to string_select and removed the previous
string_select (which was only called in the same place, and just called
string_strided_select with stride = 1). I added a check that will memcpy
the string if stride is 1.

I removed string_index_of as a compiler primitive and just made it an
extern call. I also added the method string.indexOf, which calls the
function. I went ahead and just kept it returning an int since it's
simpler. Returning a range might be more useful if it was possible to
input a pattern instead of a fixed delimiter, because then the length
might be different.

I moved string.substring out of ChapelRangeBase.chpl and into
ChapelRange.chpl. This let me remove the reference to the range class
member _base, so the implementation of substring isn't tied up with the
internal implementation of range.

I removed the string length calculation and bounds checking from
string_select, and put the checks in the caller instead. Assumably
whoever is calling a primitive knows what they are doing, and
string_select should really only be called from one place anyway, so I
think it's okay to leave these safety checks up to the caller. Besides,
I imagine after strings have been revamped this will need to be
reimplemented anyway, so it will probably not be around long.

There's some other noise in the diff from formatting the indentation of
a section in chpltypes.c so I could read it easier while looking for the
source of a bug. Feel free to remove that.

I ran /test/release/ and /test/trivial/bwross; everything passed.

OT: I've made some good progress on the Net module, so I'll be
submitting some new patches for review this weekend. I can't seem to get
single variables working, so for now the call to getaddrinfo is synchronous.

-- Brandon

On 12/02/2013 06:06 PM, Brandon Ross wrote:
> Hello, all. Forgive me if I'm doing something wrong; I'm not familiar
> with the patch submission procedure, or which parts of the code are OK
> to touch.
>
> I added a compiler primitive for finding the index of a substring within
> a string. I didn't see any other way of doing the same thing
> efficiently. It's basically just a wrapper around strstr. It returns 0
> if the substring couldn't be found. I didn't add a method to the string
> type (outside of the module I'm working on) for calling it, but I'm
> using it in the module for simple parsing procedures.
>
> I also changed string.substring to more gracefully handle unbounded and
> out-of-bounds range limits by taking the intersection of the passed
> range and the bounding range of the string (e.g.,
> "hello".substring(-5..30) == "hello"). Ranges that don't overlap the
> string at all will return an empty string (e.g.,
> "hello".substring(50..100) == ""). It takes striding into account as
> well. I also made string_index (single character substrings) return an
> empty string if the index is out of range. I don't see the danger in
> handling substrings like this instead of crashing, and other languages
> (e.g., Python) do this as well. All in all I think it makes string
> handling a little friendlier.
>
> Also, for edge cases that return empty strings, I simply returned an
> empty string literal. I have no idea how safe that is or how it affects
> garbage collection. Let me know if it's better practice to allocate a
> new string instead.
>
> I put a test case in /test/trivial/bwross/string_index.chpl.
>
> -- Brandon

Index: compiler/AST/primitive.cpp
===================================================================
--- compiler/AST/primitive.cpp	(revision 22383)
+++ compiler/AST/primitive.cpp	(working copy)
@@ -546,7 +546,6 @@
   prim_def(PRIM_C_STRING_FROM_STRING, "c_string_from_string", returnInfoStringC, true, true);
   prim_def(PRIM_CAST_TO_VOID_STAR, "cast_to_void_star", returnInfoOpaque, true, false);
   prim_def("string_select", returnInfoString, true, true);
-  prim_def("string_strided_select", returnInfoString, true, true);
   prim_def("sleep", returnInfoVoid, true);
   prim_def("real2int", returnInfoDefaultInt);
   prim_def("object2int", returnInfoDefaultInt);
Index: modules/internal/ChapelBase.chpl
===================================================================
--- modules/internal/ChapelBase.chpl	(revision 22374)
+++ modules/internal/ChapelBase.chpl	(working copy)
@@ -662,6 +662,9 @@
   inline proc ascii(param a: string) param return __primitive("ascii", a);
   inline proc param string.length param return __primitive("string_length", this);
   inline proc _string_contains(param a: string, param b: string) param return __primitive("string_contains", a, b);
+
+  extern proc string_index_of(x:string, y:string):int;
+  inline proc string.indexOf(substring:string):int return string_index_of(this, substring);
   
   //
   // min and max
Index: modules/internal/ChapelRange.chpl
===================================================================
--- modules/internal/ChapelRange.chpl	(revision 22374)
+++ modules/internal/ChapelRange.chpl	(working copy)
@@ -500,7 +500,7 @@
       ambig = true;
     }
   
-    var temp = _base.this(other._base);
+    var temp = _base[other._base];
     return new range(temp.idxType, temp.boundedType, temp.stridable,
                      temp, !ambig && (this._aligned || other._aligned));
   }
@@ -597,15 +597,14 @@
   }
   
   // Return a substring of a string with a range of indices.
-  proc string.substring(r: range(?))
-  {
-    if r.isAmbiguous() then
-      __primitive("chpl_error", "substring -- Cannot select from a string using a range with ambiguous alignment.");
-  
-    return this.substring(r._base);
+  inline proc string.substring(r: range(?)) {
+    var r2 = r[1..this.length];
+    if r2.isEmpty() then return "";
+    var lo:int = r2.alignedLow, hi:int = r2.alignedHigh;
+    return __primitive("string_select", this, lo, hi, r2.stride);
   }
+
   
-  
   //################################################################################
   //# Internal helper functions.
   //#
Index: modules/internal/ChapelRangeBase.chpl
===================================================================
--- modules/internal/ChapelRangeBase.chpl	(revision 22374)
+++ modules/internal/ChapelRangeBase.chpl	(working copy)
@@ -605,71 +605,71 @@
       else
         return BoundedRangeType.boundedNone;
     }
-  
+
     // If this range is unbounded below, we use low from the other range,
     // so that max(lo1, lo2) == lo2.  etc.
     var lo1 = if hasLowBound() then this._low else other._low;
     var hi1 = if hasHighBound() then this._high else other._high;
     var st1 = abs(this.stride);
-  
+
     var lo2 = if other.hasLowBound() then other._low else this._low;
     var hi2 = if other.hasHighBound() then other._high else this._high;
     var st2 = abs(other.stride);
-  
+
     // If the result type is unsigned, don't let the low bound go negative.
     // This is a kludge.  We should really obey type coercion rules. (hilde)
     if (_isUnsignedType(idxType)) { if (lo1 < 0) then lo1 = 0; }
-  
+
     // We inherit the sign of the stride from this.stride.
     var newStride: strType = this.stride;
     var lcm: strType = abs(this.stride);
     var (g, x): 2*strType = (lcm, 0:strType);
-  
+
     if this.stride != other.stride && this.stride != -other.stride {
-  
+
       (g, x) = chpl__extendedEuclid(st1, st2);
       lcm = st1 / g * st2;        // The LCM of the two strides.
-    // The division must be done first to prevent overflow.
-  
+      // The division must be done first to prevent overflow.
+
       newStride = if this.stride > 0 then lcm else -lcm;
     }
-  
+
     var result = new rangeBase(idxType,
-                           computeBoundedType(this, other),
-                           this.stridable | other.stridable,
-                           max(lo1, lo2):idxType,
-                           min(hi1, hi2):idxType,
-                           newStride);
-  
-   if result.stridable {
-    var al1 = (this._alignment % st1:idxType):int;
-    var al2 = (other._alignment % st2:other.idxType):int;
-  
-    if (al2 - al1) % g != 0 then
-    {
-      // empty intersection, return degenerate result
-      if !isBoundedRangeB(result) then
-        halt("could not represent range slice - it needs to be empty, but the slice type is not bounded");
-      (result._low, result._high, result._alignment) =
-      (1:idxType, 0:idxType, if this.stride > 0 then 1:idxType else 0:idxType);
-      // _alignment == _low, so it won't print.
-    }
-    else
-    { // non-empty intersection
-  
-      // x and/or the diff may negative, even with a uint source range.
-      var offset = (al2 - al1) * x;
-      // offset is in the range [-(lcm-1), lcm-1]
-      if offset < 0 then offset += lcm;
-  
-      // Now offset can be safely cast to idxType.
-      result._alignment = al1:idxType + offset:idxType * st1:idxType / g:idxType;
-    }
-   } else {
+        computeBoundedType(this, other),
+        this.stridable | other.stridable,
+        max(lo1, lo2):idxType,
+        min(hi1, hi2):idxType,
+        newStride);
+
+    if result.stridable {
+      var al1 = (this._alignment % st1:idxType):int;
+      var al2 = (other._alignment % st2:other.idxType):int;
+
+      if (al2 - al1) % g != 0 then
+      {
+        // empty intersection, return degenerate result
+        if !isBoundedRangeB(result) then
+          halt("could not represent range slice - it needs to be empty, but the slice type is not bounded");
+        (result._low, result._high, result._alignment) =
+          (1:idxType, 0:idxType, if this.stride > 0 then 1:idxType else 0:idxType);
+        // _alignment == _low, so it won't print.
+      }
+      else
+      { // non-empty intersection
+
+        // x and/or the diff may negative, even with a uint source range.
+        var offset = (al2 - al1) * x;
+        // offset is in the range [-(lcm-1), lcm-1]
+        if offset < 0 then offset += lcm;
+
+        // Now offset can be safely cast to idxType.
+        result._alignment = al1:idxType + offset:idxType * st1:idxType / g:idxType;
+      }
+    } else {
       // !(result.stridable)
       result._alignment = 0;
-   }
-  
+    }
+
     return result;
   }
   
@@ -1011,19 +1011,7 @@
     }
   }
   
-  // Return a substring of a string with a range of indices.
-  inline proc string.substring(r: rangeBase(?))
-  {
-    if r.boundedType != BoundedRangeType.bounded then
-      compilerError("substring indexing undefined on unbounded ranges");
   
-    if r.stride != 1 then
-      return __primitive("string_strided_select", this, r.alignedLow, r.alignedHigh, r.stride);
-    else
-      return __primitive("string_select", this, r.low, r.high);
-  }
-  
-  
   //################################################################################
   //# Internal helper functions.
   //#
Index: runtime/include/chpltypes.h
===================================================================
--- runtime/include/chpltypes.h	(revision 22374)
+++ runtime/include/chpltypes.h	(working copy)
@@ -229,9 +229,9 @@
 void chpl_string_widen(struct chpl_chpl____wide_chpl_string_s* x, chpl_string from, int32_t lineno, chpl_string filename);
 void chpl_comm_wide_get_string(chpl_string* local, struct chpl_chpl____wide_chpl_string_s* x, int32_t tid, int32_t lineno, chpl_string filename);
 chpl_string string_concat(chpl_string x, chpl_string y, int32_t lineno, chpl_string filename);
+c_int string_index_of(chpl_string x, chpl_string y);
 chpl_string string_index(chpl_string x, int i, int32_t lineno, chpl_string filename);
-chpl_string string_select(chpl_string x, int low, int high, int32_t lineno, chpl_string filename);
-chpl_string string_strided_select(chpl_string x, int low, int high, int stride, int32_t lineno, chpl_string filename);
+chpl_string string_select(chpl_string x, int low, int high, int stride, int32_t lineno, chpl_string filename);
 int32_t chpl_string_compare(chpl_string x, chpl_string y);
 int64_t string_length(chpl_string x);
 
Index: runtime/src/chpltypes.c
===================================================================
--- runtime/src/chpltypes.c	(revision 22374)
+++ runtime/src/chpltypes.c	(working copy)
@@ -248,21 +248,30 @@
   return z;
 }
 
+c_int string_index_of(chpl_string x, chpl_string y) {
+  chpl_string z = strstr(x, y);
+  return z ? (c_int) (z-x)+1 : 0;
+}
 
+// It is up to the caller to make sure low and high are within the string
+// bounds and that stride is not 0.
 chpl_string
-string_strided_select(chpl_string x, int low, int high, int stride, int32_t lineno, chpl_string filename) {
-  int64_t length = string_length(x);
+string_select(chpl_string x, int low, int high, int stride, int32_t lineno, chpl_string filename) {
   char* result = NULL;
   char* dst = NULL;
+
+  if (low  < 1) low = 1;
+  if (high < 1) return "";
+
   chpl_string src = stride > 0 ? x + low - 1 : x + high - 1;
-  int size = high - low >= 0 ? high - low : 0;
-  if (low < 1 || low > length || high > length) {
-    chpl_error("string index out of bounds", lineno, filename);
-  }
-  result = chpltypes_malloc(size + 2, CHPL_RT_MD_STRING_STRIDED_SELECT_DATA,
+  int size = high-low+1;
+  result = chpltypes_malloc(size + 1, CHPL_RT_MD_STRING_STRIDED_SELECT_DATA,
                             lineno, filename);
   dst = result;
-  if (stride > 0) {
+  if (stride == 1) {
+    memcpy(result, src, size);
+    dst = result + size;
+  } else if (stride > 0) {
     while (src - x <= high - 1) {
       *dst++ = *src;
       src += stride;
@@ -273,22 +282,17 @@
       src += stride;
     }
   }
+
   *dst = '\0';
-  // result is already a copy, so we don't have to copy  it again.
   return result;
 }
 
 chpl_string
-string_select(chpl_string x, int low, int high, int32_t lineno, chpl_string filename) {
-  return string_strided_select(x, low, high, 1, lineno, filename);
-}
-
-chpl_string
 string_index(chpl_string x, int i, int32_t lineno, chpl_string filename) {
+  if (i-1 < 0 || i-1 >= string_length(x))
+    return "";
   char* buffer = chpltypes_malloc(2, CHPL_RT_MD_STRING_COPY_DATA,
                                   lineno, filename);
-  if (i-1 < 0 || i-1 >= string_length(x))
-    chpl_error("string index out of bounds", lineno, filename);
   sprintf(buffer, "%c", x[i-1]);
   return buffer;
 }
Index: test/trivial/bwross/string_index.chpl
===================================================================
--- test/trivial/bwross/string_index.chpl	(revision 0)
+++ test/trivial/bwross/string_index.chpl	(working copy)
@@ -0,0 +1,38 @@
+var s:string = "hello world";
+
+// string_index_of
+write(s.indexOf(" "), " ");
+write(s.indexOf("hello"), " ");
+write(s.indexOf("world"), " ");
+write(s.indexOf("nowhere"), " ");
+write(s.indexOf(s), " ");
+writeln(s.indexOf(""));
+
+// string_select
+writeln(s.substring(..));
+writeln(s.substring(..5));
+writeln(s.substring(7..));
+writeln(s.substring(4..8));
+writeln(s.substring(-50..50));
+writeln(s.substring(-100..-50));
+writeln(s.substring(50..100));
+
+// with stride
+writeln(s.substring(1..11 by 2));
+writeln(s.substring(1..11 by -1));
+writeln(s.substring(1..11 by -2));
+writeln(s.substring(1.. by 3));
+writeln(s.substring(2.. by 3));
+writeln(s.substring(-3.. by 3));
+writeln(s.substring(-1.. by 3));
+writeln(s.substring(-2.. by 3));
+
+// string_index cases
+writeln(s.substring(99));
+writeln(s.substring(3));
+writeln(s.substring(-30));
+
+// all together now!
+writeln(s.substring(1..s.indexOf(" ")),
+        "beautifu", s.substring(3),
+        s.substring(s.indexOf(" ")..));
Index: test/trivial/bwross/string_index.good
===================================================================
--- test/trivial/bwross/string_index.good	(revision 0)
+++ test/trivial/bwross/string_index.good	(working copy)
@@ -0,0 +1,20 @@
+6 1 7 0 1 1
+hello world
+hello
+world
+lo wo
+hello world
+
+
+hlowrd
+dlrow olleh
+drwolh
+hlwl
+eood
+l r
+eood
+hlwl
+
+l
+
+hello beautiful world
------------------------------------------------------------------------------
Sponsored by Intel(R) XDK 
Develop, test and display web and hybrid apps with a single code base.
Download it for free now!
http://pubads.g.doubleclick.net/gampad/clk?id=111408631&iu=/4140/ostg.clktrk
_______________________________________________
Chapel-developers mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/chapel-developers

Reply via email to