Introduction

In a study of the Theomatic factor 90 relating to references to either one of the sons in the story of the Prodigal Son (Luke 15), the author states in the beginning of section 7: "In the most conservative way possible, we need to find out what the actual odds are for this event occurring." (p.33)

The author publishes that the event being considered occurred as follows:

 L    H    S     P       1:N
 4   46   765   .3115      3
 3   39   467   .0081    123
 2   25   226   .0009   1135

The author observed 46 successes (hits, H) in a sample of 765 phrases of at most 4 words in length. Of these 765 phrases, 467 were 3 words or less, of which 39 were successes. Of these 467 phrases, 226 were two words or less, of which 25 were successful Theomatics events. The author's task is to compute the statistical probability of each of these occurrences as if there were no design in Theomatics, that is, as if it were merely a random phenomenon.

The author, in calculating the p-values (P) for the above results and the associated odds 1:N, does not utilize the sample sizes given, which are the actual sample sizes, but attempts a reduction in each sample size using a Word Length Average (WLA) statistic. The WLA is the average number of base words per phrase (discounting articles and beginning conjunctions). He claims that it is appropriate to remove phrases from the sample until the WLA of the sample, WLAS, is the same as the WLA of the successful phrases (hits) obtained from it, WLAH. This claim rests on an assumption that the expected value of WLAS, MS, is not more than MH, the expected value of WLAH.

Assumption: MS < MH

To support this claim, the author states (in uncorrected figures): "Theomatics discovered a total of 46 hits of 2.37 WLA. There are 434 possible phrase combinations with the same 2.37 WLA (the number pool) from which Theomatics could have (emphasis mine) derived its data. Our objective here is to find out what the probability is of finding 46 hits out of a number pool of 434 numbers." (Lk15, p.37) He states further: "Calculating the probability according to the length of the phrase -- matching the WLA of both the Theomatic hits and the 434 number pool -- is a totally fair, honest, and objective way to figure the p-value." (p. 44) Evidently, no justification is actually proposed by the author; he merely states that he is correct. A formal analysis of the validity of this assumption is warranted. First, we will complete our explanation of the author's methodology.

It so happens that WLAS is larger than WLAH, contrary to the way the author expects Theomatics to occur normally in such an experiment, so his procedure is simply to remove some of the longer (4-word) phrases from the 4-word sample, regardless of whether they were successful trials or not, until WLAS equals WLAH. Eliminating all of the 4-word phrases reduces the sample from 765 to 467, giving 2.413 for WLAS. Since this is still greater than his target of 2.37, he continues by removing 3-word phrases until the sample, now 434 phrases, yields the appropriate WLA (p. 35). The author considers this "par for the course," and states that it is now possible to test his hypothesis that God designed Theomatics (p. 35, emphasis his): "This figure of 434 now constitutes the number pool -- it will now give us an accurate comparison of Theomatics against the null hypothesis -- and enable us to come up with an accurate p factor."
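The reduction described above can be sketched in Python. This is a minimal illustration, not the author's actual computation: the per-length phrase counts are derived here from the cumulative word and phrase totals in the WLA table given later in this analysis (assuming those totals count base words), and the names `counts`, `wla`, and `reduce_sample` are hypothetical.

```python
# Phrase counts by exact length, derived (an assumption) from the cumulative
# figures in the WLA table:
#   <=2 words: 226 phrases,  404 words -> 48 one-word + 178 two-word
#   <=3 words: 467 phrases, 1127 words -> 241 three-word
#   <=4 words: 765 phrases, 2319 words -> 298 four-word
counts = {1: 48, 2: 178, 3: 241, 4: 298}
TARGET_WLA = 109 / 46  # WLA of the 46 hits, about 2.37


def wla(pool):
    """Average number of base words per phrase in the pool."""
    phrases = sum(pool.values())
    words = sum(length * n for length, n in pool.items())
    return words / phrases


def reduce_sample(pool, target):
    """Drop the longest remaining phrase, one at a time, until WLA <= target.

    Hits and misses are not distinguished, mirroring the procedure described.
    """
    reduced = dict(pool)
    while wla(reduced) > target:
        longest = max(length for length, n in reduced.items() if n > 0)
        reduced[longest] -= 1
    return reduced


pool = reduce_sample(counts, TARGET_WLA)
pool_size = sum(pool.values())
print(pool_size, round(wla(pool), 3))  # 434 2.369
```

Under these assumptions the sketch removes all 298 four-word phrases and then 33 three-word phrases, which reproduces the 434-phrase number pool quoted by the author.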

The author follows this activity by recalculating the p-values without using the WLA (p. 44), attempting to demonstrate that his conclusions do not ultimately depend upon this procedure, stating that the more obvious statistics -- those that any practicing statistician would naturally use -- still yield significant results.

Though the (uncorrected) odds for the 4-word phrases, as noted above, are 1 in 3, the author claims that the p-value for the 4-word phrases without considering WLA is 0.00014856, or odds of 1 in 6,731 (p. 44). He apparently arrived at this figure by mistakenly using the sample size relevant to the 3-word phrases (467) instead of the original sample of 765 4-word phrases. This appears to be an oversight, since he subsequently uses the 3-word sample size correctly in determining the p-value for the hits of phrases 3 words or less in length. In any case, he states that the final odds for the 4-word phrases without using WLA to adjust sample size are 1 in 1,101,684 (p. 44). They should have been reported as 1 in 525.

The (uncorrected) final p-values (P = PH x PC) without WLA, with odds 1:N, and general significance O (being the average number of random trials needed to get this kind of result if Theomatics does not exist), obtained from the p-values of the hits (PH) and clustering (PC), the number of hits (H), the sample sizes (S), along with the actual clustering results (0, 1, 2) for each phrase length, are as follows:

 L    H    S     0    1    2   PH      PC       P        1:N        O
 4   46   765   16   21    9   .3115   .00611   .00190       525     1.06
 3   39   467   15   16    8   .0081   .00563   .00005    21,840    11.24
 2   25   226   11   10    4   .0009   .00452   4.0E-6   251,340   616.96
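The relationship between the columns above (P = PH x PC and N = 1/P) can be checked with a short Python sketch. The inputs are the PH and PC values as printed; the function name `combined_odds` is hypothetical. Because the printed inputs are rounded, the recomputed odds for the 3-word and 2-word rows are expected to differ slightly from the tabulated ones.

```python
# (L, PH, PC, published N) taken from the uncorrected table as printed.
rows = [
    (4, 0.3115, 0.00611, 525),
    (3, 0.0081, 0.00563, 21840),
    (2, 0.0009, 0.00452, 251340),
]


def combined_odds(p_hits, p_cluster):
    """Return (P, N) where P = PH * PC and N = 1 / P."""
    p = p_hits * p_cluster
    return p, 1.0 / p


for length, ph, pc, published_n in rows:
    p, n = combined_odds(ph, pc)
    print(f"L={length}: P={p:.3g}  1:N = 1:{n:,.0f}  (published 1:{published_n:,})")
```

The 4-word row reproduces the 1 in 525 figure directly from .3115 and .00611; the other two rows agree with the table to within a few percent.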

In concluding his analysis, the author uses this WLA sample-reduction technique to determine the statistical odds of obtaining the 3-word and 2-word hits in his experiment. He thus finds the odds of the 3-word experiment to be 1:261,205,726 (p. 47) which is the result that he formally publishes as the result of his experiment, and he finds the 2-word result to have odds of one in billions (p. 48, detailed also in reporting errors).

Analysis

In light of the above errors in analysis, the use of the WLA is clearly essential to the author's published conclusion. The 4-word and 3-word results of the experiment without considering WLA are within the range of the random benchmark derived from the maximum order statistic O were our context random (see our Methodology analysis). The 2-word phrase result would be considered very unusual, but not as sensational as the author's claim. The author is unable to solidly reject his null hypothesis without this sample-reduction procedure. The billions-to-one odds are impressively published, but are not quite so apparent from analysis of the experimental results apart from the author's use of the WLA. Though the author's analysis stands or falls with such use of the WLA, he offers no explicit justification for his approach.

As stated earlier, the author's claim that it is valid to adjust the sample size based on WLA implies he thinks the mean WLA for hits is expected to be at least as large as the WLA of the relevant sample, such that, on average, if many like tests were conducted in similar contexts, the WLA's of the resulting phrase samples would be no larger than the average word length of the hits obtained from them.

The reasoning is implied from the very procedure the author employs. Reducing the sample size to something the author would "expect" to be a "normal" or "reasonable" sample size implies that the sample size observed in the experiment is somehow known to be unusual or inappropriate. Instead of discarding the experimental context as inappropriate to display the evident statistical significance of Theomatics, and looking for another scenario that fits more nicely with his expectation, the author conveniently prefers to adjust the sample size.

On average, he expects that the WLA of the sample (WLAS) should not be larger than that of the hits (WLAH); it should be smaller. Naturally, then, he feels the sample obtained in this particular Theomatic experiment is highly unusual and can be adjusted without adversely affecting the validity of the resulting conclusion. This implies he feels the expected WLA of the sample (MS) is no larger than the expected WLA of the hits (MH), so he is able to justify a presumed sample size that "could have" occurred having a WLAS that is upper-bounded by the WLAH of the hits. Being "conservative" so as not to corrupt his conclusion, he only reduces the sample size so as to obtain an equivalent WLA to that of the hits, not making the sample WLAS arbitrarily smaller than this WLAH bound as he expects it to be.

This reasoning clearly implies the author feels that Theomatic events tend to occur more frequently in longer phrases than in shorter ones, resulting in a WLAH for the hits that exceeds the WLAS of the sample on average. This comprises the author's "justification" to use the WLA to reduce the sample size. A summary review of the logic follows:

• Requiring WLAS to be < WLAH is only valid if MS < MH.
• MS < MH is only true if longer phrases tend to give more hits.

The necessary question is this: Is the author's assumption valid?

To answer this question, we simply observe, as the author notes, that the (uncorrected) data in this particular experiment indicate otherwise:

 L   Sample Words   Sample Size   Sample WLA   Hit Words   Hits   Hit WLA
 4       2319           765          3.031        109       46     2.370
 3       1127           467          2.413         81       39     2.077
 2        404           226          1.788         39       25     1.560
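The WLA columns above follow directly from the word and phrase counts, since WLA is total base words divided by number of phrases. A minimal Python sketch of that arithmetic follows; the names `table` and `wla` are hypothetical and the inputs are the figures as printed.

```python
# (L, sample words, sample size, hit words, hits) as printed in the table.
table = [
    (4, 2319, 765, 109, 46),
    (3, 1127, 467, 81, 39),
    (2, 404, 226, 39, 25),
]


def wla(words, phrases):
    """Word Length Average: base words per phrase."""
    return words / phrases


results = {}
for length, s_words, s_size, h_words, hits in table:
    wla_s = round(wla(s_words, s_size), 3)
    wla_h = round(wla(h_words, hits), 3)
    results[length] = (wla_s, wla_h)
    print(f"L={length}: WLAS={wla_s:.3f}  WLAH={wla_h:.3f}  WLAS > WLAH: {wla_s > wla_h}")
```

In all three rows the recomputed sample WLA exceeds the hit WLA, matching the tabulated values.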

Clearly, in each case the WLA of the sample is significantly larger than the WLA of the hits. The author therefore reduces the sample size in each case to something he expects "could have" been observed, and thus obtains incredible statistical significance for the Theomatic phenomenon. Again, the author's entire analysis stands or falls based upon this reasoning.

Further, we observe the following: the author plainly emphasizes the fact, based upon thousands of Theomatic instances he has carefully observed, collected, and analyzed over the last two decades, that Theomatic occurrences clearly favor the smaller phrases, obtaining a higher percentage of hits from them: "The one major factor that makes Theomatics stand tall -- is the shortness of the phrases that produce the Theomatic hits." (p.35) Again (p. 44), "The real power of Theomatics is the shortness and explicitness of the theomatic phrases and hits. After all, if some sort of Intelligence factor is at work here, then we would expect short and explicit -- one, two, and three word phrases, to produce the most significant results. It is when we look at that aspect, that the p-values literally go ballistic. This is true across the board -- from hundreds of individual studies in my files consisting of thousands of features. The reason for this, is that as one expands outwardly, the patterns dissipate."

In making such statements, the author implies that WLAH will generally be smaller than WLAS (or that MS > MH); the smaller the phrase length being considered, the higher the proportion of hits and the resulting statistical significance of the results. This implies that the percentage of hits (%) will be larger for the smaller phrases than for the larger ones, so the mean of WLAH is less than the mean of WLAS. We can easily observe that this conclusion appears valid here.

 L   Hits    S     %    WLAS    WLAH     D
 2    25    226   11   1.788   1.560   .228
 3    39    467    8   2.413   2.077   .336
 4    46    765    6   3.031   2.370   .662

The larger the allowed phrase length, the larger the actual difference D between the WLA of the sample and its hits (D = WLAS - WLAH). The author's observation that Theomatics "dissipates" in WLA performance when moving toward longer phrases appears correct. What is observed in the experiment appears to be appropriate to use as a general assumption about the behavior of Theomatics, being totally consistent with thousands of instances observed by the author.
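The two trends noted above (hit percentage falling and D rising as the allowed phrase length grows) can be recomputed from the raw counts. The following Python sketch assumes the word and phrase totals from the earlier WLA table; `summarize` is a hypothetical helper name.

```python
# (L, sample words, sample size, hit words, hits), shortest phrases first.
rows = [
    (2, 404, 226, 39, 25),
    (3, 1127, 467, 81, 39),
    (4, 2319, 765, 109, 46),
]


def summarize(s_words, s_size, h_words, hits):
    """Return (hit percentage, D) with D = WLAS - WLAH."""
    pct = 100.0 * hits / s_size
    d = s_words / s_size - h_words / hits
    return pct, d


pcts, ds = [], []
for length, s_words, s_size, h_words, hits in rows:
    pct, d = summarize(s_words, s_size, h_words, hits)
    pcts.append(round(pct))
    ds.append(round(d, 3))
    print(f"L={length}: hits={pct:.1f}%  D={d:.3f}")

# Hit percentage falls and D grows as longer phrases are admitted.
print(pcts == sorted(pcts, reverse=True), ds == sorted(ds))
```

Computing D from the unrounded WLA values gives .662 for the 4-word row, as tabulated; subtracting the rounded WLAs would give .661.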

These facts flatly contradict the author's implicit assumption that WLAS can legitimately be upper-bounded by WLAH. In fact, if WLAS ever did fall below WLAH, this would certainly be an oddity, not the norm, based upon the author's own data and claims.

In addition to granting him the sensational claim of unbelievable statistical odds, the author's sample reduction procedure conveniently permits him to consider successful trials in his calculation of the p-value that are not retained in his sample, which contradicts the very definition of a sample. Such mathematics is quite creative, to say the least.

Theoretical Conclusion

The author's reduction of the sample based on WLA implies a contradiction in the definition of his experiment: his assumption is inconsistent with his own claims and with experimental data. His reasoning and the implied conclusion, the billions-to-one odds of Theomatics, cannot be accepted. One simply cannot correctly say that 46 hits were obtained from a sample of 434 phrases (p.38), or that 39 hits were obtained in 170 phrases (p.46), or that 25 hits were obtained in a sample of 117 (p. 48). While it is certainly true, though quite unlikely, that such results "could have" occurred in a hypothetical test of Theomatics, this fact is irrelevant: these results did not occur in the test that was conducted, apparently would not normally occur in any such experiment, and should not be considered in determining p-values in Theomatics.
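To illustrate why the choice of sample size matters so much, the following Python sketch compares the apparent hit rate in the actual samples with the rate in the reduced number pools cited above. This comparison is an illustration added here, not a calculation from the author's paper; it uses only the hit counts and sample sizes already quoted, and `hit_rate` is a hypothetical helper.

```python
# (L, hits, actual sample size, reduced pool size) as cited in this analysis.
cases = [
    (4, 46, 765, 434),
    (3, 39, 467, 170),
    (2, 25, 226, 117),
]


def hit_rate(hits, size):
    """Hits as a percentage of the phrases considered."""
    return 100.0 * hits / size


rates = {}
for length, hits, actual, reduced in cases:
    rates[length] = (hit_rate(hits, actual), hit_rate(hits, reduced))
    print(f"L={length}: actual {rates[length][0]:.1f}%  reduced pool {rates[length][1]:.1f}%")
```

Holding the hits fixed while shrinking the pool raises the apparent hit rate in every case (for example, from about 6% to about 10.6% for the 4-word phrases), which is what drives the much smaller p-values.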

The Facts

In order to determine the probability of the Theomatic phenomenon observed by the author, one must correct the errors the author made in his analysis. The correct results are:

 L    H    S     0    1    2   PH      1:NH   PC       P        1:N   O
 4   53   683   12   22   19   .0100   100    .80116   .00797   125   1.002
 3   35   412   10   16    9   .0091   110    .18664   .00170   589   1.078
 2   19   195    6   10    3   .0128    78    .09000   .01572   864   1.150

The chart shows the phrase length L, the number of hits H, the sample size S, the clustering results (0, 1, 2), the probability of the hits PH, the odds of this number of hits occurring in such a sample 1:NH (NH = 1/PH), the probability of the cluster distribution PC, the total probability of the hits and clustering P (P = PH x PC), the final odds 1:N (N = 1/P), and the representative statistic O (being the average number of random trials needed to get this kind of result if Theomatics does not exist).

The correct way to determine the statistical significance of the results of any experiment is to look carefully at what actually occurred in the experiment. Since the author has formally stated that the experiment is to consider results for all phrases of 4 words or less (p.22), the first O statistic obtained from the 4-word phrases, 1.002, would technically be the result of his experiment. Therefore the results of this experiment are such that they would be expected on average to occur in nearly every single test if Theomatics were random.

The author's published claim of odds of 1 in 261 million corresponds to the final p-value for the 3-word phrases, reducing to 1 in 589 when his errors are corrected. The comparable O statistic is 1.078, so one would expect this result 93% of the time. Even so, clearly, both results are well below that of the MOS (maximum order statistic) benchmark of 2, or every other test, which represents odds of 1 in 3,466.
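The percentages quoted here follow from reading O as the average number of random tests needed to see such a result, so that a result at least this strong is expected in roughly 1/O of random tests. The following Python sketch applies that reading (an interpretation inferred from the text, with the hypothetical helper `expected_frequency`) to the O values in the corrected table.

```python
def expected_frequency(o_stat):
    """Fraction of random tests expected to show such a result, taken as 1/O."""
    return 1.0 / o_stat


o_values = {
    "4-word": 1.002,
    "3-word": 1.078,
    "2-word": 1.150,
    "MOS benchmark": 2.0,
}

for label, o_stat in o_values.items():
    print(f"{label}: O={o_stat} -> about {100 * expected_frequency(o_stat):.1f}% of random tests")
```

With O = 1.078 this gives about 92.8%, the 93% figure above, and the benchmark O = 2 corresponds to every other test.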

The last result, that of the 2-word phrases, is similarly insignificant, as observed in extensive testing.

The Inevitable Final Conclusion

The above results are all certainly well within what might be expected in a random context: the null hypothesis cannot be rejected... doing so is not even a consideration. The correct conclusion to draw in this Luke 15 analysis is that no Theomatic significance is evident at all. No other conclusion may be deemed correct, much less "conservative."