Monday, December 10, 2012

Range Non Existence(^) Searches Using PatIndex

We all know that PatIndex is one of the cool features of T-SQL. It helps to locate the position of the pattern in a given string.

We can also use PatIndex to do range searches. For Example, "[A-Z]" would search for any alphabetical character. "[0-9]" would search for any numeric character.  With the use of these range searches, it gets much easier to find out data issues.

Let us execute the following queries to generate some data for our learning.

  CREATE TABLE #t1 (PATCHECK VARCHAR(20))

  INSERT INTO #T1 (PATCHECK) VALUES ('ABCDEFGHIJKLMN')

  INSERT INTO #T1 (PATCHECK) VALUES ('SIVA PAT INDEX CHECK')

  INSERT INTO #T1 (PATCHECK) VALUES ('SIVA 123')

  INSERT INTO #T1 (PATCHECK) VALUES ('SIVA 123 !@#')

  INSERT INTO #T1 (PATCHECK) VALUES ('123456789')

  INSERT INTO #T1 (PATCHECK) VALUES ('!@#$%^&')

if you select everything from this table, this is how it will look like.

Query  - 0
  SELECT * FROM #T1
    ABCDEFGHIJKLMN
    SIVA PAT INDEX CHECK
    SIVA 123
    SIVA 123 !@#
    123456789
    !@#$%^&


Now, Let us execute a simple PatIndex Search. 

Query-1

  SELECT *
  FROM #T1 
  WHERE PATINDEX('%SIVA%',PATCHECK) > 0 

This query would return any row that has the string "SIVA" in it. The results will look like below.     
   SIVA PAT INDEX CHECK
   SIVA 123
   SIVA 123 !@#

Range Search
Here is the example for a range search.
The below mentioned query means that any row that does not have an alphabetical character. 

Query - 2

  SELECT *
  FROM #T1 
  WHERE PATINDEX('%[A-Z]%',PATCHECK)  = 0


results would look like
  123456789
    !@#$%^&

The below mentioned query means that any row that contains (notice the difference in the operator. Previous one was =0, this one below is > 0) alphabetical character. 

Query - 3
  SELECT *
  FROM #T1 
  WHERE PATINDEX('%[A-Z]%',PATCHECK)  > 0

results would look like 
   ABCDEFGHIJKLMN
   SIVA PAT INDEX CHECK
   SIVA 123
   SIVA 123 !@#

Range Search with Not Operator

"Not Operator" is added in front of the range set like this [^A-Z] . This means that any non alphabetical character. 

Query - 4 
  SELECT *
  FROM #T1 
  WHERE PATINDEX('%[^A-Z]%',PATCHECK)  > 0

results would look like 
SIVA PAT INDEX CHECK
SIVA 123
SIVA 123 !@#
123456789
!@#$%^&

Note that even though query 2 and Query 4 looks similar, they are not producing the same results. 

Multiple Range Searches 

It is also possible to combine multiple range searches in a single query. The below mentioned query would return any row that does have only alphabets or numeric. 

Query - 5
  SELECT *
  FROM #T1 
  WHERE PATINDEX('%[^0-9][^A-Z]%',PATCHECK)  = 0

results would look like. 
ABCDEFGHIJKLMN
123456789


Hope this helps..!!