Glob pattern library

Glob patterns derive from regular expressions, from which they borrow a simplified syntax. They are used for simple true/false checks where the full power of regular expressions is not needed, and they are much faster and consume fewer resources. A typical use is wildcard matching of file names. This is an implementation in Java:

  • Simple to use: create a Glob instance, match it against some sources, and that's all!
  • Up to 10 times faster than regular expressions
  • No learning curve, since its syntax is borrowed from regular expressions
  • Thread safe
  • Use the power of a priority list to perform complex tests
  • Already provides several filters for the most common file tests

Syntax

This library recognizes the most common wildcards:

  • "*" : matches any characters, up to the next character of the pattern or the end of input. A * cannot follow another * or a ?
  • "?" : matches a single character of any kind (except reserved ones)
  • "[]" : group, matches one of the characters given in the brackets, e.g. [xyz]. Groups also accept ranges: [0-9]
  • "\" : escape sequence, e.g. \* matches a literal asterisk.
  • "+" : the plus character is reserved for a future sequence-repetition feature.
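
For readers coming from regular expressions, each wildcard above can be read as a small regex construct. The sketch below is an illustration only and is not part of VostokGlobLib: `GlobToRegexSketch` and `toRegex` are hypothetical names, and the translation relies solely on `java.util.regex`. Because a regex backtracks, the translated pattern is more permissive than the library, which applies the restrictions listed in the next paragraph.

```java
import java.util.regex.Pattern;

/** Hypothetical helper: maps the wildcard syntax described above onto java.util.regex. */
final class GlobToRegexSketch {
    private GlobToRegexSketch() {}

    static Pattern toRegex(String glob) {
        StringBuilder regex = new StringBuilder();
        for (int i = 0; i < glob.length(); i++) {
            char c = glob.charAt(i);
            if (c == '*') {
                regex.append(".+");              // at least one character
            } else if (c == '?') {
                regex.append('.');               // exactly one character
            } else if (c == '[') {
                int close = glob.indexOf(']', i + 1);
                if (close <= i + 1) {
                    throw new IllegalArgumentException("empty or unclosed group");
                }
                regex.append('[');
                for (int j = i + 1; j < close; j++) {
                    appendLiteral(regex, glob.charAt(j), true); // no escapes inside a group
                }
                regex.append(']');
                i = close;
            } else {
                if (c == '\\' && i + 1 < glob.length()) {
                    c = glob.charAt(++i);        // escaped character is taken literally
                }
                appendLiteral(regex, c, false);
            }
        }
        return Pattern.compile(regex.toString(), Pattern.DOTALL);
    }

    /** Appends c as a literal, backslash-escaping ASCII punctuation ('-' is kept for ranges). */
    private static void appendLiteral(StringBuilder regex, char c, boolean keepDash) {
        boolean needsEscape = c < 128 && !Character.isLetterOrDigit(c) && !(keepDash && c == '-');
        if (needsEscape) {
            regex.append('\\');
        }
        regex.append(c);
    }

    public static void main(String[] args) {
        System.out.println(toRegex("*.tgz").pattern());                          // .+\.tgz
        System.out.println(toRegex("*.gz").matcher("abc.tar.gz").matches());     // true (regex backtracks)
    }
}
```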

To improve performance, some restrictions apply:

  • An asterisk is at least one character long. You cannot directly match empty input.
  • An asterisk always consumes at least one character, whatever the next wildcard in the pattern is.
  • An asterisk stops at the end of input, or as soon as a character accepted by the next element of the pattern is found. Therefore, "abc.tar.gz" is NOT a match for "*.gz", but it is for "*.tar.gz". "abc.tgz" is a match for "*.tgz", and so is ".abc.tgz"; for this last example, "*.tgz" and ".*.tgz" are both valid patterns.
  • Groups cannot be repeated, therefore "abc.r01" is NOT a match for "*.r[0-9]" (but it is for "*.r[0-9][0-9]"). A group cannot start or end with the "-" character.
  • Groups do not recognize escape sequences, therefore "[\\*a]" accepts "\\", "*" or "a" as valid characters.
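
The restrictions above amount to matching in a single left-to-right pass without backtracking. The following is a minimal sketch of that reading, written only from the rules stated here; it is not the VostokGlobLib source, and `SinglePassGlobSketch` and its methods are hypothetical names. It does not enforce the rule about "-" at the start or end of a group, and it lets "?" accept any character.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.IntPredicate;

/** Illustrative single-pass matcher; an assumption-based sketch, not VostokGlobLib code. */
final class SinglePassGlobSketch {
    private SinglePassGlobSketch() {}

    /** One pattern element: '*' when accepts is null, otherwise a single-character test. */
    private static final class Token {
        final IntPredicate accepts;
        Token(IntPredicate accepts) { this.accepts = accepts; }
    }

    static boolean matches(String pattern, String input) {
        List<Token> tokens = parse(pattern);
        int pos = 0;
        for (int t = 0; t < tokens.size(); t++) {
            Token token = tokens.get(t);
            if (pos >= input.length()) {
                return false; // every element, '*' included, needs at least one character
            }
            if (token.accepts == null) {
                pos++; // '*' always consumes one character, whatever follows
                Token next = t + 1 < tokens.size() ? tokens.get(t + 1) : null;
                // then it keeps consuming until the next element accepts, or input ends
                while (pos < input.length()
                        && (next == null || !next.accepts.test(input.charAt(pos)))) {
                    pos++;
                }
            } else if (token.accepts.test(input.charAt(pos))) {
                pos++;
            } else {
                return false; // no backtracking: a mismatch is final
            }
        }
        return pos == input.length();
    }

    private static List<Token> parse(String p) {
        List<Token> out = new ArrayList<>();
        boolean afterWildcard = false; // previous element was '*' or '?'
        for (int i = 0; i < p.length(); i++) {
            char c = p.charAt(i);
            if (c == '*') {
                if (afterWildcard) {
                    throw new IllegalArgumentException("'*' cannot follow '*' or '?'");
                }
                out.add(new Token(null));
                afterWildcard = true;
            } else if (c == '?') {
                out.add(new Token(ch -> true));
                afterWildcard = true;
            } else if (c == '[') {
                int close = p.indexOf(']', i + 1);
                if (close < 0) {
                    throw new IllegalArgumentException("unclosed group");
                }
                String body = p.substring(i + 1, close); // taken literally, no escapes
                out.add(new Token(ch -> inGroup(body, ch)));
                i = close;
                afterWildcard = false;
            } else {
                if (c == '\\' && i + 1 < p.length()) {
                    c = p.charAt(++i); // escaped character is a plain literal
                }
                final char literal = c;
                out.add(new Token(ch -> ch == literal));
                afterWildcard = false;
            }
        }
        return out;
    }

    /** True if ch is listed in the group body, either directly or through an a-z style range. */
    private static boolean inGroup(String body, int ch) {
        for (int i = 0; i < body.length(); i++) {
            if (i + 2 < body.length() && body.charAt(i + 1) == '-') {
                if (ch >= body.charAt(i) && ch <= body.charAt(i + 2)) {
                    return true;
                }
                i += 2;
            } else if (body.charAt(i) == ch) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        System.out.println(matches("*.gz", "abc.tar.gz"));     // false
        System.out.println(matches("*.tar.gz", "abc.tar.gz")); // true
        System.out.println(matches("*.tgz", ".abc.tgz"));      // true
        System.out.println(matches("*.r[0-9]", "abc.r01"));    // false
    }
}
```

In this sketch, "*.gz" rejects "abc.tar.gz" because the asterisk stops at the first ".", after which "t" does not match "g" and the matcher never goes back to try a later ".". Giving up backtracking in this way is what would make such a matcher cheaper than a regular expression.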

Patterns can be case insensitive. The performance cost is the time needed to convert the string to lower case.
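
Case-insensitive matching by lower-casing can be sketched as a thin wrapper around any matcher. This is an illustration under assumptions, not the library's code: `IgnoreCaseSketch` and `ignoringCase` are hypothetical names, and a plain suffix test stands in for a compiled glob.

```java
import java.util.Locale;
import java.util.function.BiPredicate;

/** Hypothetical wrapper showing case-insensitive matching via lower-casing. */
final class IgnoreCaseSketch {
    private IgnoreCaseSketch() {}

    /** Wraps a (pattern, input) matcher so both sides are lower-cased before the test. */
    static BiPredicate<String, String> ignoringCase(BiPredicate<String, String> matcher) {
        return (pattern, input) -> matcher.test(
                pattern.toLowerCase(Locale.ROOT), // Locale.ROOT gives locale-independent results
                input.toLowerCase(Locale.ROOT));
    }

    public static void main(String[] args) {
        // Stand-in matcher for the demo: "input ends with pattern"
        BiPredicate<String, String> suffix = (pattern, input) -> input.endsWith(pattern);
        System.out.println(suffix.test("@vostoksystem.com", "XYZ@VostokSystem.com"));               // false
        System.out.println(ignoringCase(suffix).test("@vostoksystem.com", "XYZ@VostokSystem.com")); // true
    }
}
```

A compiled pattern could presumably be lower-cased once at compile time, which would leave only the conversion of the input as the per-match cost.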

 

Example

Using Glob is really simple. First, compile your pattern by calling one of the static methods; this returns a Glob object. Then call its matches method. Glob is thread safe.

String m = "*@vostoksystem.*";
// Compile the pattern once; the resulting Glob can be reused
Glob g = Glob.compile(m);

String tmpA = "abc@vostoksystem.eu";
String tmpB = "xyz@vostoksystem.com";
String tmpC = "XYZ@VostokSystem.com";
String tmpD = "xyz@gmail.com";

System.out.println("is \"" + tmpA + "\" a match for  \"" + m + "\" ? " + g.matches(tmpA));
System.out.println("is \"" + tmpB + "\" a match for  \"" + m + "\" ? " + g.matches(tmpB));
System.out.println("is \"" + tmpC + "\" a match for  \"" + m + "\" ? " + g.matches(tmpC));
System.out.println("is \"" + tmpD + "\" a match for  \"" + m + "\" ? " + g.matches(tmpD));

Output:

is "abc@vostoksystem.eu" a match for "*@vostoksystem.*" ? true
is "xyz@vostoksystem.com" a match for "*@vostoksystem.*" ? true
is "XYZ@VostokSystem.com" a match for "*@vostoksystem.*" ? true
is "xyz@gmail.com" a match for "*@vostoksystem.*" ? false

Matching against several patterns with PriorityGlobList is as simple as a single test, but requires a few more lines.
Note that this requires a second jar: VostokIoLib

System.out.println("\n\n----------------  using PriorityGlobList ------------------");
PriorityGlobList pl = new PriorityGlobList();
pl.add(PriorityGlobList.createEntry("matroska", "*.mkv"));
pl.add(PriorityGlobList.createEntry("audio video interleave", "*.avi"));
pl.add(PriorityGlobList.createEntry("mpeg-4", "*.mp4"));

String tmpA = "/media/video.foo.mkv";
String tmpB = "Other.mp4";
String tmpC = "some music.mp3";
String tmpD = "A movie.avi";

System.out.println("is \"" + tmpA + "\" a video ? " + (pl.matchfile(tmpA) != null));
System.out.println("is \"" + tmpB + "\" a video ? " + (pl.matchfile(tmpB) != null));
System.out.println("is \"" + tmpC + "\" a video ? " + (pl.matchfile(tmpC) != null));
System.out.println("is \"" + tmpD + "\" a video ? " + (pl.matchfile(tmpD) != null));
System.out.println("------");

System.out.println("Which kind of codec is " + tmpA + " ? " + pl.matchfile(tmpA).getKey());

Output:

----------------  using PriorityGlobList ------------------
is "/media/video.foo.mkv" a video ? true
is "Other.mp4" a video ? true
is "some music.mp3" a video ? false
is "A movie.avi" a video ? true
------
Which kind of codec is /media/video.foo.mkv ? matroska
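
The idea behind a keyed list of patterns can be sketched without either jar. The class below is an assumption-based illustration, not the PriorityGlobList API: `FirstMatchListSketch`, `add` and `firstMatch` are hypothetical names, insertion order is assumed to define priority, and simple suffix tests stand in for compiled globs.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Predicate;

/** Hypothetical ordered list of labelled tests; the first accepting entry wins. */
final class FirstMatchListSketch {

    /** A labelled test: key is returned when test accepts the input. */
    private static final class Entry {
        final String key;
        final Predicate<String> test;
        Entry(String key, Predicate<String> test) {
            this.key = key;
            this.test = test;
        }
    }

    private final List<Entry> entries = new ArrayList<>();

    /** Entries added earlier take priority over later ones (an assumption of this sketch). */
    FirstMatchListSketch add(String key, Predicate<String> test) {
        entries.add(new Entry(key, test));
        return this;
    }

    /** Returns the key of the first entry accepting the input, or null if none does. */
    String firstMatch(String input) {
        for (Entry e : entries) {
            if (e.test.test(input)) {
                return e.key;
            }
        }
        return null;
    }

    public static void main(String[] args) {
        FirstMatchListSketch videos = new FirstMatchListSketch()
                .add("matroska", name -> name.endsWith(".mkv"))
                .add("audio video interleave", name -> name.endsWith(".avi"))
                .add("mpeg-4", name -> name.endsWith(".mp4"));

        System.out.println(videos.firstMatch("/media/video.foo.mkv")); // matroska
        System.out.println(videos.firstMatch("some music.mp3"));       // null
    }
}
```

Returning the key rather than a boolean lets one call answer both "is this a video?" and "which kind?", as the example above does with matchfile and getKey.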

Install

VostokGlobLib consists of a single jar weighing 15.6 KB. The priority list depends on VostokIoLib.

Download

 

 

Maven repository

A repository will be available soon

Javadocs

Licence

VostokGlobLib is provided free of charge for community, private, or commercial work under the Creative Commons CC-BY licence, no attribution needed.